I have an application that queries a specific folder for its contents at a short interval. Until now I have been using FindFirstFile, but even with a search pattern applied I feel there will be performance problems in the future, since the folder can get pretty big; in fact, restricting its size is not in my hands at all.

I then decided to give FindFirstFileEx a chance, in combination with some tips from this question.

My exact call is the following:

const char* search_path = "somepath/*.*";
WIN32_FIND_DATA fd;
HANDLE hFind = ::FindFirstFileEx(search_path, FindExInfoBasic, &fd, FindExSearchNameMatch, NULL, FIND_FIRST_EX_LARGE_FETCH);

Now I get pretty good performance, but what about compatibility? My application requires Windows Vista+, but the documentation says the following about FIND_FIRST_EX_LARGE_FETCH:

This value is not supported until Windows Server 2008 R2 and Windows 7.

I can compile it just fine on my Windows 7 machine, but what will happen if someone runs this on a Vista machine? Does the function downgrade the flag to 0 (the default) in this case? Is it safe not to test for the operating system?

UPDATE:

I said above that I feel the performance is not good. In fact, my numbers on a fixed set of files (about 100 of them) are the following:

FindFirstFile   -> 22 ms
FindFirstFile   -> 4 ms (using a specific pattern; however, all files may be wanted)
FindFirstFileEx -> 1 ms (no matter patterns or full list)

What I worry about is what will happen if the folder grows to, say, 50k files. That's about 500 times bigger, and still not big enough. At 22 ms per 100 files, that is about 11 seconds per scan, for an application querying at 25 fps (it's graphical).
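As a back-of-the-envelope check of those figures, the extrapolation can be sketched as below. It assumes the scan time grows linearly with the number of files, which is only a guess and has not been measured; the function and constant names are made up for illustration.

```cpp
#include <cstdio>

// Timings measured on the ~100-file test folder (taken from the table above).
constexpr double kMeasuredFiles     = 100.0;
constexpr double kFindFirstFileMs   = 22.0;
constexpr double kFindFirstFileExMs = 1.0;

// Time available per frame at 25 fps: 1000 ms / 25 = 40 ms.
constexpr double kFrameBudgetMs = 1000.0 / 25.0;

// ASSUMPTION: cost scales linearly with the number of directory entries.
constexpr double ExtrapolateMs(double measuredMs, double targetFiles)
{
    return measuredMs * (targetFiles / kMeasuredFiles);
}

void PrintEstimates()
{
    const double target = 50000.0;
    std::printf("FindFirstFile:   %.0f ms\n", ExtrapolateMs(kFindFirstFileMs, target));   // 11000 ms
    std::printf("FindFirstFileEx: %.0f ms\n", ExtrapolateMs(kFindFirstFileExMs, target)); // 500 ms
    std::printf("Frame budget:    %.0f ms\n", kFrameBudgetMs);                            // 40 ms
}
```

Under that assumption even the FindFirstFileEx figure (about 500 ms) would be far above the 40 ms frame budget, so a full scan on every frame would not fit either way.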

vlzvl
  • You "***feel*** there will be performance problems". How about ***testing*** instead and knowing for certain? The most important rule in understanding performance is: _measure, measure, measure_. We don't know what your search pattern (assuming not `*.*`) or result-set size may be. The feature has a trade-off; if you use it on small directories, you might actually degrade performance. (Unfortunately, I don't have access to a system where I can test the effect on older Windows. But it is probably an unknown flag and will simply be ignored.) – Disillusioned Jan 11 '17 at 05:56
  • PS: Your file system can have a big impact on your testing. E.g. FAT32 has a much lower limit for number of files than NTFS. – Disillusioned Jan 11 '17 at 06:00
  • The last parameter is an integer to which you can pass multiple flags combined with bitwise operators. Usually, if you don't want any flags you pass 0, and if you want several you pass them like `FIND_FIRST_EX_CASE_SENSITIVE | FIND_FIRST_EX_LARGE_FETCH`. So I think it's safe as long as you define `FIND_FIRST_EX_LARGE_FETCH` as a macro with value 0 if it's not defined already. – MRB Jan 11 '17 at 06:02
  • @CraigYoung, my file system is _NTFS_ (Windows 7). Yes, I know I can pass multiple flags, but I only want `FIND_FIRST_EX_LARGE_FETCH`, as this is what brings the performance on heavy folders. If I can confirm it is reduced to zero when it's not supported, then it's good to go; what I don't want is some kind of crash :). You can check my question update as well – vlzvl Jan 11 '17 at 06:13
  • Cool. Another thing to bear in mind: When working on performance improvements, don't forget to consider other things going on in the system. E.g. If you iterate the files and process each file, the overhead of a "slow" `FindFirstFile` call may be negligible relative to other processing. Also it might not be as much of a problem if the function is only rarely called. _That said, using the flag looks like a worthwhile change if you can confirm backwards compatibility_. – Disillusioned Jan 11 '17 at 07:08
  • @CraigYoung, I am at ease about processing the files, as I got that down to about `4 ms` and it fits easily within the current locked frame rate (25 fps). Unfortunately for me, the big overhead is just _scanning the files_. My search pattern may or may not be set; that's unknown. Without one, I just have to query all files and process any changes, and this happens every frame, as I don't know whether files have been modified; I have to scan them first :) Perhaps a whole new approach is needed, but at the moment `FindFirstFileEx` seems like a fix – vlzvl Jan 11 '17 at 07:20
  • In that case you may be better off looking into [File Change Notification](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365261(v=vs.85).aspx) – Disillusioned Jan 11 '17 at 07:59
  • @CraigYoung, thanks for the link, I didn't know about that... I'll try this as well – vlzvl Jan 11 '17 at 08:57
  • You need to use `ReadDirectoryChangesW` with IOCP and only a single call to `FindFirstFileEx` – RbMm Jan 11 '17 at 09:16
  • side note: Use the Unicode versions of these functions, the non-unicode versions behave oddly and may not act as expected. They also usually have to have the overhead of converting to Unicode. – Mgetz Jan 11 '17 at 13:38

2 Answers

2

I just tested this under Windows XP (with a binary compiled under Windows 7). FindFirstFileEx() fails with error 0x57 (ERROR_INVALID_PARAMETER, "The parameter is incorrect") when it is called with FIND_FIRST_EX_LARGE_FETCH. You should check the Windows version and choose the value of the additional-flags parameter dynamically. Also, FindExInfoBasic is not supported before Windows Server 2008 R2 and Windows 7, and it causes the same run-time 0x57 error. If the binary runs under an older Windows version, it must be replaced with FindExInfoStandard.
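As an alternative to checking the Windows version, the call can simply be tried with the fast options first and retried with the baseline ones when it fails with ERROR_INVALID_PARAMETER. The following is only a rough sketch of that idea: `OpenSearchWithFallback` is a made-up helper name, the Unicode (`W`) variant is used, and the flag is defined by hand in case the SDK headers hide it for the targeted Windows version.

```cpp
#ifdef _WIN32   // Windows-only sketch
#include <windows.h>

// Older target settings (_WIN32_WINNT below 0x0601) may hide this flag.
#ifndef FIND_FIRST_EX_LARGE_FETCH
#define FIND_FIRST_EX_LARGE_FETCH 0x00000002
#endif

// Hypothetical helper: opens a directory search, preferring the fast options
// and falling back to the baseline ones if the OS rejects them.
HANDLE OpenSearchWithFallback(const wchar_t* pattern, WIN32_FIND_DATAW* fd)
{
    // Remembers a confirmed downgrade so an old system pays for the failed
    // attempt only once. Not thread-safe; meant for a single scanning thread.
    static bool fastPathRejected = false;

    if (!fastPathRejected) {
        HANDLE h = ::FindFirstFileExW(pattern, FindExInfoBasic, fd,
                                      FindExSearchNameMatch, nullptr,
                                      FIND_FIRST_EX_LARGE_FETCH);
        if (h != INVALID_HANDLE_VALUE)
            return h;
        if (::GetLastError() != ERROR_INVALID_PARAMETER)   // 0x57
            return INVALID_HANDLE_VALUE;                   // genuine failure, e.g. path not found
    }

    // Baseline call that older Windows versions accept.
    HANDLE h = ::FindFirstFileExW(pattern, FindExInfoStandard, fd,
                                  FindExSearchNameMatch, nullptr, 0);
    if (h != INVALID_HANDLE_VALUE)
        fastPathRejected = true;   // the fast options were the problem
    return h;
}
#endif
```

The returned handle is used with FindNextFileW and FindClose as usual. The downgrade is remembered only after the baseline call succeeds, so a bad path does not permanently disable the fast options.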

Alex P.
  • Checking for Windows versions is the wrong approach, because it cannot cope with features that do get backported to a down-level OS. Instead, call the API and handle specific errors appropriately. – IInspectable Jan 11 '17 at 13:15
  • @IInspectable it's OK insofar as they use the version helper function [`IsWindows7OrGreater`](https://msdn.microsoft.com/en-us/library/windows/desktop/dn424959(v=vs.85).aspx). That said, I'd question whether they really need Vista support, considering how small a market that is. – Mgetz Jan 11 '17 at 13:44
  • @Mgetz: How does use of the helper functions change anything? You're still proposing to implement what you never meant to implement. If you want to use an API in a particular way, call that API in that particular way, and handle errors appropriately. Don't use circumstantial evidence (e.g. the OS version that was the first to introduce a feature, at some point in time). This is not an invariant. – IInspectable Jan 11 '17 at 14:34
  • It amazes me what kind of WinAPI functions you find each day. I didn't know about [IsWindows7OrGreater](https://msdn.microsoft.com/en-us/library/windows/desktop/dn424959(v=vs.85).aspx) and the whole family of them. Anyway, thank you for testing this on XP, which pretty much solves my problem. @IInspectable, are you saying it's better to handle returned errors and/or downgrade functionality rather than checking for the Windows version? – vlzvl Jan 11 '17 at 16:08
  • @user6096479: You know the answer to that question already. Are you interested in knowing, which operating system runs your code, or are you interested in calling an API? – IInspectable Jan 11 '17 at 16:17
  • @user6096479: IInspectable offered a good approach. Call the API variant that gives you maximum speed first; if you get the incorrect-parameter error, call the slower variant. But there is also *NtQueryDirectoryFile*, as mentioned in another answer. It seems to be supported since WinXP. You should check this out too. – Alex P. Jan 11 '17 at 16:58
  • @AlexP., I think it's a good approach as well. I don't like comparing Windows versions either, but I always fear crashes that come without error codes. If it does not crash, then I'm good to go :) Two calls, one normal and one downgraded to plain [FindFirstFile](https://msdn.microsoft.com/en-us/library/windows/desktop/aa364418(v=vs.85).aspx), seems about right. – vlzvl Jan 11 '17 at 17:15
  • @user6096479: and, of course, once you decide you have to downgrade, you can set a flag to tell yourself that all future searches should be downgraded as well. – Remy Lebeau Jan 11 '17 at 19:42
1

First of all, periodically querying a folder for its contents at a short interval is not the best solution, I think.

You should call ReadDirectoryChangesW instead. Then you no longer need periodic queries, because you get notified when files in the directory change. The best way is to bind the directory handle to a thread pool with BindIoCompletionCallback or CreateThreadpoolIo and then make the first call to ReadDirectoryChangesW directly. When changes occur, your callback is called automatically; after you have processed the data, call ReadDirectoryChangesW again from the callback. Repeat this until the callback receives STATUS_NOTIFY_CLEANUP (with BindIoCompletionCallback) or ERROR_NOTIFY_CLEANUP (with CreateThreadpoolIo), which means that you closed the directory handle to stop the notifications, or until it receives some other error.

After that first call to ReadDirectoryChangesW, run the FindFirstFileEx/FindNextFile loop, but only once, and handle every returned file as if it were a FILE_ACTION_ADDED notification.
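To give a feel for the API, here is a minimal sketch. Note that it is deliberately simpler than the scheme described above: instead of the thread-pool callbacks it uses a blocking ReadDirectoryChangesW loop, which would have to run on a dedicated worker thread. `ChangeCallback`, `OpenDirectoryForWatching` and `WatchDirectory` are made-up names, and the notification filter is just one plausible choice.

```cpp
#ifdef _WIN32   // Windows-only sketch
#include <windows.h>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback type: receives the FILE_ACTION_* code and the file
// name relative to the watched directory.
using ChangeCallback = std::function<void(DWORD action, const std::wstring& name)>;

// Opens a directory so that it can be passed to ReadDirectoryChangesW.
HANDLE OpenDirectoryForWatching(const wchar_t* dir)
{
    return ::CreateFileW(dir, FILE_LIST_DIRECTORY,
                         FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                         nullptr, OPEN_EXISTING,
                         FILE_FLAG_BACKUP_SEMANTICS,   // required to open a directory
                         nullptr);
}

// Blocks the calling thread and reports changes until ReadDirectoryChangesW
// fails (for example because the pending request was cancelled).
void WatchDirectory(HANDLE hDir, const ChangeCallback& onChange)
{
    std::vector<DWORD> buffer(16 * 1024);   // 64 KB; DWORD elements keep it DWORD-aligned
    const DWORD bufferBytes = static_cast<DWORD>(buffer.size() * sizeof(DWORD));
    const DWORD filter = FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_SIZE |
                         FILE_NOTIFY_CHANGE_LAST_WRITE;
    DWORD bytes = 0;

    while (::ReadDirectoryChangesW(hDir, buffer.data(), bufferBytes,
                                   FALSE /* do not watch subdirectories */, filter,
                                   &bytes, nullptr, nullptr))
    {
        if (bytes == 0)
            continue;   // too many changes to report: a full rescan is needed here

        const BYTE* p = reinterpret_cast<const BYTE*>(buffer.data());
        for (;;) {
            const auto* info = reinterpret_cast<const FILE_NOTIFY_INFORMATION*>(p);
            // FileNameLength is in bytes and the name is not null-terminated.
            onChange(info->Action,
                     std::wstring(info->FileName, info->FileNameLength / sizeof(WCHAR)));
            if (info->NextEntryOffset == 0)
                break;
            p += info->NextEntryOffset;
        }
    }
}
#endif
```

In a graphical application the callback could just mark the changed file as dirty, so that the render loop only re-reads what actually changed instead of scanning the folder every frame.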

Now, about performance and compatibility.

All of the following is given for information only. I am not recommending that you use it or not use it.

If you need it, look at ZwQueryDirectoryFile. It gives you a very big performance win:

  1. You open the directory handle only once, not on every scan as with FindFirstFileEx.
  2. The main point is the ReturnSingleEntry parameter. This is the key: set it to FALSE and pass a large enough buffer as FileInformation. If ReturnSingleEntry is TRUE, the function returns only one file per call, so for a folder containing N files you need to call ZwQueryDirectoryFile N times. With ReturnSingleEntry == FALSE you can get all files in a single call if the buffer is large enough. In any case you seriously reduce the number of round trips to the kernel, which are very costly: one query returning N files is much faster than N queries. FIND_FIRST_EX_LARGE_FETCH does exactly this (it sets ReturnSingleEntry to FALSE), but in the current implementation (I checked this on the latest Windows 10) the system does so only in the FindNextFile calls. The first call, made by FindFirstFileEx, still uses ReturnSingleEntry == TRUE for some unknown reason, so there are at least 2 calls to ZwQueryDirectoryFile where a single call would be possible (if the buffer is large enough, of course). If you use ZwQueryDirectoryFile directly, you also control the buffer size: you can allocate, say, 1 MB once and reuse it for the periodic queries without reallocation. You cannot control how large a buffer FindFirstFileEx uses with FIND_FIRST_EX_LARGE_FETCH (in the current implementation it is 64 KB, a quite reasonable value).
  3. You have a much richer choice of FileInformationClass: the less information a class returns, the smaller the data and the faster the function.
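For illustration only, the points above could be put together roughly as follows. This is a sketch under several assumptions: the function is resolved from ntdll.dll at run time with GetProcAddress, `DirEntry` is a hand-written mirror of FILE_DIRECTORY_INFORMATION as documented in the WDK, `ListDirectory` is a made-up name, and the directory handle is assumed to have been opened for synchronous I/O (CreateFileW with FILE_LIST_DIRECTORY and FILE_FLAG_BACKUP_SEMANTICS, without FILE_FLAG_OVERLAPPED).

```cpp
#ifdef _WIN32   // Windows-only sketch
#include <windows.h>
#include <winternl.h>   // NTSTATUS, IO_STATUS_BLOCK, UNICODE_STRING, FILE_INFORMATION_CLASS
#include <string>
#include <vector>

// Hand-written mirror of FILE_DIRECTORY_INFORMATION (layout as documented in the WDK).
struct DirEntry {
    ULONG         NextEntryOffset;   // 0 for the last entry in the buffer
    ULONG         FileIndex;
    LARGE_INTEGER CreationTime, LastAccessTime, LastWriteTime, ChangeTime;
    LARGE_INTEGER EndOfFile, AllocationSize;
    ULONG         FileAttributes;
    ULONG         FileNameLength;    // in bytes; the name is not null-terminated
    WCHAR         FileName[1];
};

using NtQueryDirectoryFileFn = NTSTATUS (NTAPI*)(
    HANDLE, HANDLE, PVOID, PVOID, PIO_STATUS_BLOCK, PVOID, ULONG,
    FILE_INFORMATION_CLASS, BOOLEAN, PUNICODE_STRING, BOOLEAN);

// Lists all names in an already opened directory. The handle is opened once by
// the caller and reused for every scan. Not thread-safe (static buffer).
bool ListDirectory(HANDLE hDir, std::vector<std::wstring>& names)
{
    static const auto query = reinterpret_cast<NtQueryDirectoryFileFn>(
        ::GetProcAddress(::GetModuleHandleW(L"ntdll.dll"), "NtQueryDirectoryFile"));
    if (!query)
        return false;

    // 1 MB, allocated once; LONGLONG elements keep the buffer 8-byte aligned.
    static std::vector<LONGLONG> buffer(1024 * 1024 / sizeof(LONGLONG));
    const ULONG bufferBytes = static_cast<ULONG>(buffer.size() * sizeof(LONGLONG));
    const NTSTATUS kStatusNoMoreFiles = static_cast<NTSTATUS>(0x80000006L);

    names.clear();
    BOOLEAN restart = TRUE;   // start from the first entry on every new scan
    for (;;) {
        IO_STATUS_BLOCK iosb = {};
        const NTSTATUS status = query(hDir, nullptr, nullptr, nullptr, &iosb,
                                      buffer.data(), bufferBytes,
                                      FileDirectoryInformation,
                                      FALSE,     // ReturnSingleEntry: fill the buffer
                                      nullptr,   // no name filter
                                      restart);
        if (status == kStatusNoMoreFiles)
            return true;
        if (status < 0)       // any other failure
            return false;
        restart = FALSE;

        const BYTE* p = reinterpret_cast<const BYTE*>(buffer.data());
        for (;;) {
            const auto* e = reinterpret_cast<const DirEntry*>(p);
            names.emplace_back(e->FileName, e->FileNameLength / sizeof(WCHAR));
            if (e->NextEntryOffset == 0)
                break;
            p += e->NextEntryOffset;
        }
    }
}
#endif
```

The result includes the "." and ".." entries where the file system reports them. A smaller information class would reduce the amount of data copied per entry, at the cost of declaring another structure by hand.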

What about compatibility? It has existed and worked, with full functionality, from at least Windows 2000 up to the latest Windows 10. (The documentation says "Available starting with Windows XP", but ntifs.h declares it under #if (NTDDI_VERSION >= NTDDI_WIN2K), and it really was already present in Windows 2000. It does not matter, though: XP support is more than enough now.)

But isn't this undocumented, unsupported, kernel-mode only, with no lib file?

It is documented and, as you can see, it is available in both user and kernel mode. How do you think FindFirstFile[Ex] / FindNextFile work? They call ZwQueryDirectoryFile; there is no other way. All calls into the kernel go through ntdll.dll; this is fundamental. (Yes, it is still possible that ntdll.dll stops exporting it by name and starts exporting it by ordinal only, just to show that unsupported really means unsupported.) Lib files exist, even two of them, ntdll.lib and ntdllp.lib (the latter contains more APIs than the former), in any WDK. Where is it declared? In #include <ntifs.h>. That header conflicts with #include <windows.h>, yes, but if you include ntifs.h inside a namespace, with some tricks, the conflicts can be avoided.

RbMm
  • Calling native NT functions (`ZwQueryDirectoryFile`) is unsupported from non-driver code. Doing so may cause the application to break randomly across versions. Please stick to WIN32 published API functions ONLY. – Mgetz Jan 11 '17 at 13:43
  • @Mgetz I wrote about "unsupported" in my own answer, and note that it is given only as information, with no recommendation to use it or not. However, it works without any problems in user-mode apps; some file managers, for example, call this API directly in their own code for performance reasons – RbMm Jan 11 '17 at 13:52
  • Given that it takes up the majority of your answer, people will see it as an actual solution. It is not, and it should be removed from the answer. In general, unsupported and incorrect information should be left out of answers. – Mgetz Jan 11 '17 at 13:53
  • @Mgetz - what I wrote is `correct information` - I think you need to downvote my answer – RbMm Jan 11 '17 at 13:56
  • @Mgetz `people will see it as an actual solution` - but you do not see it that way? Do you think most people are stupid enough not to understand what is written, what is offered and what is only for information? I do not think so. Everybody can decide for themselves how to interpret the suggestion to use `ReadDirectoryChangesW` and the information about how `FindFirstFileEx` works internally and why it can be faster, and whether or not to use the Nt* API. – RbMm Jan 11 '17 at 14:05
  • Thank you for this valuable info. Since it's window-driver thing and my app already requires some stuff, i'll think more about it as compatibility is major point. It seems i cannot vote anything at the moment :( – vlzvl Jan 11 '17 at 16:12
  • @user6096479 "it's window-driver " - this is absolutely false. It is a user-mode function, and you **call it** indirectly all the time, because `FindFirstFile[Ex]` is a thin shell over `ZwQueryDirectoryFile`. As for the solution: as I wrote, you need to use notifications (`ReadDirectoryChangesW`) instead of periodic queries – RbMm Jan 11 '17 at 16:28
  • @RbMm, I'll try it. If [NtQueryDirectoryFile](https://msdn.microsoft.com/en-us/library/windows/hardware/ff567047(v=vs.85).aspx) gives this much performance, I'll find out. However, since my clients are mostly noobs at tweaking the system, I hope it doesn't require additional settings. What does [with special kernel APCs enabled.](https://msdn.microsoft.com/en-us/library/windows/hardware/ff543219(v=vs.85).aspx) mean? – vlzvl Jan 11 '17 at 17:10
  • @user6096479 - I am not recommending that you use it, try it, or avoid it. What I am trying to explain is that this is a *usual* user-mode API, just like `FindFirstFile[Ex]` or any other. In *user* mode there is no difference between the Nt* and Zw* versions: they are aliases for the same function (in kernel mode they are different functions). As for your question or note about kernel-mode APCs, I do not understand it. You are in user mode, and all of that is absolutely unrelated to your code and does not affect it – RbMm Jan 11 '17 at 17:21
  • @user6096479 - `ZwQueryDirectoryFile` exists on all systems (I think even on `nt4`), requires no additional settings, and has better performance even than `FindFirstFile[Ex]`. It is a user-mode API, so calling it is no harder or easier than calling any function from kernel32. It is "unsupported" only in the sense that it could suddenly disappear, say, in the next build. It has existed unchanged for more than 17 years, but just in case.. – RbMm Jan 11 '17 at 17:25
  • @RbMm, yes, I read that it's supported from `WinXP+`, which is a good thing for compatibility. I don't care about it being deprecated or anything :) I just don't want to baffle clients with settings. [ZwQueryDirectoryFile](https://msdn.microsoft.com/en-us/library/windows/hardware/ff567047(v=vs.85).aspx) requires a good read and some tests on my part, as I don't usually go that deep :) and perhaps it proves faster than the combination of **FindFirstFileEx + FIND_FIRST_EX_LARGE_FETCH**, but with full compatibility. Thanks – vlzvl Jan 11 '17 at 17:37
  • @user6096479 - what do you mean by "settings"? This API does not require any "settings". "requires a good read and tests " - yes, *any* API and code require this :) And yes, it is faster even than `FindFirstFileEx + FIND_FIRST_EX_LARGE_FETCH` *if you use it correctly* – RbMm Jan 11 '17 at 17:44
  • @RbMm, about _settings_ i was referring to **ZwQueryDirectoryFile** requiring _special kernel APCs enabled._. But since i'm gonna call it in user mode as **NtQueryDirectoryFile** i should not care about settings, right? It's a note in the **[remarks](https://msdn.microsoft.com/en-us/library/windows/hardware/ff567047(v=vs.85).aspx)** – vlzvl Jan 11 '17 at 18:21
  • @user6096479 - we are running in **user** mode here, so all APCs are enabled. That part of the documentation covers only the **kernel** side of this function. You should not care about it at all – RbMm Jan 11 '17 at 20:34
  • @user6096479 - and user-mode code always runs at IRQL = PASSIVE_LEVEL anyway. That note is only for kernel-mode callers; for user mode all of this is always true – RbMm Jan 11 '17 at 20:57
  • @RbMm, I found a [sample](https://cboard.cprogramming.com/windows-programming/120136-traversing-directories-using-ntquerydirectoryfile.html) that seems to be working, but I can't understand one thing. I compiled it as a 32-bit binary under Win7. It works perfectly on WinXP, but on Win7 about 25% of the launches output only garbage. Debugging shows that sometimes the function returns without errors and reports that some data is present, but the buffer content is unchanged. Do you have experience using this function? – Alex P. Jan 12 '17 at 17:11
  • @AlexP. - yes, I have a huge amount of experience using `ZwQueryDirectoryFile` in user mode - of course it works – RbMm Jan 12 '17 at 17:37
  • @AlexP. - that sample is not well designed. If you want a good example, which works without any garbage - http://pastebin.com/9CR2iPam – RbMm Jan 12 '17 at 18:45
  • @RbMm: I ported your version to user mode. It works great. I was curious why the version I found did not work well. It was simply buffer alignment. `Nt/ZwQueryDirectoryFile` requires that `FILE_DIRECTORY_INFORMATION` (for example) is 8-byte aligned. After fixing that, it works fine too. But it's strange, because sometimes it also worked when the buffer was not 8-byte aligned. – Alex P. Jan 13 '17 at 01:14
  • @AlexP. - first, the code I pasted was tested and run in **user mode**; it does not need porting at all. As for your sample, at the very least it handles file names incorrectly, because they are not null-terminated! Also, if you use `ZwOpenFile` instead of `CreateFileW`, you can use relative names and do not have to format the full name every time, which is a great advantage too – RbMm Jan 13 '17 at 09:10
  • @RbMm: You are right. I was confused by the "Zw" in the function name. The functions NtQueryDirectoryFile and ZwQueryDirectoryFile can be called from user mode. When they are called from user mode they behave identically (["Using Nt and Zw Versions of the Native System Services Routines"](https://msdn.microsoft.com/en-us/library/windows/hardware/ff565438%28v=vs.85%29.aspx)). – Alex P. Jan 15 '17 at 10:12