Visual Studio's fread "locks out other threads." There is an alternate version _fread_nolock, which reads "without locking other threads", which should only be used "in thread-safe contexts such as single-threaded applications or where the calling scope already handles thread isolation."

Even after reading other somewhat relevant discussions on the two, I'm still unsure whether the locking that fread implements applies to a specific FILE struct, to a specific actual file, or to all fread calls, even those on totally different files.

If you use the nolock versions, what level of locking do you need to provide? Can multiple threads in parallel be reading separate files without any locking? Can multiple threads in parallel be writing separate files without any locking? Or are there global or static variables involved that would be corrupted?

So, by using the nolock versions, can you potentially achieve better I/O throughput (assuming you aren't needlessly moving heads, e.g. when reading off separate drives or an SSD), or is the potential gain just reducing redundant locks to a single lock (which should be negligible)?

Does VS' ifstream.read function work just like the regular fread? (I don't see a nolock version of it.)

user1902689
  • The C runtime library dates back from a time long, long before operating systems supported threads. The spec was never updated to say what *should* happen when two threads call fread() on the same file. So library writers had to fend for themselves to make the old spec work. It is not like the CRT gave programmers another way. The odds that you are *actually* ahead by trying to bypass the lock is very low, I/O is quite slow. That however is not true in all cases, locale is punishingly expensive for example. The fate of most any program that tries to do it right is to avoid the CRT. – Hans Passant Apr 25 '15 at 23:35

2 Answers

The MS standard library implementation fully supports multi-threading. The C++ standard explains this requirement:

27.2.3: Concurrent access to a stream object, stream buffer object, or C Library stream by multiple threads may result in a data race unless otherwise specified.

If one thread makes a library call a that writes a value to a stream and, as a result, another thread reads this value from the stream through a library call b such that this does not result in a data race, then a’s write synchronizes with b’s read.

This means that when you write to a stream, a lock is taken (not a file lock, but a lock guarding concurrent access to the in-memory stream data structure), so that concurrent access is handled correctly for all the other threads using the same stream.
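To make the per-stream nature of that lock concrete, here is a minimal sketch. It assumes the documented MSVC functions `_lock_file`, `_unlock_file` and `_getc_nolock`, and maps them to the POSIX equivalents `flockfile`, `funlockfile` and `getc_unlocked` so it also builds elsewhere. The macro names and the helper `count_newlines` are hypothetical, made up for illustration. The stream's lock is taken once, and every character is then read through the unlocked call, instead of locking and unlocking on each `getc`:

```cpp
#include <cassert>
#include <cstddef>
#include <stdio.h>

// Map the MSVC names onto their POSIX counterparts (assumption: one of the
// two families is available on the target platform).
#if defined(_MSC_VER)
  #define LOCK_STREAM(f)   _lock_file(f)
  #define UNLOCK_STREAM(f) _unlock_file(f)
  #define GETC_NOLOCK(f)   _getc_nolock(f)
#else
  #define LOCK_STREAM(f)   flockfile(f)
  #define UNLOCK_STREAM(f) funlockfile(f)
  #define GETC_NOLOCK(f)   getc_unlocked(f)
#endif

// Hypothetical helper: counts '\n' bytes from the current position to EOF.
std::size_t count_newlines(FILE* f) {
    assert(f != nullptr);
    std::size_t lines = 0;
    LOCK_STREAM(f);                 // one lock, on this stream only
    int c;
    while ((c = GETC_NOLOCK(f)) != EOF) {
        if (c == '\n') {
            ++lines;
        }
    }
    UNLOCK_STREAM(f);               // other threads may use the stream again
    return lines;
}
```

Because the lock belongs to the `FILE` object, a thread calling `count_newlines` on one stream should not block a thread working on a different stream; only users of the same `FILE*` contend with each other.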

This locking overhead is always paid, even when it is not needed. It can affect performance, according to Microsoft:

the performance of the multithreaded libraries has been improved and is close to the performance of the now-eliminated single-threaded libraries. For those situations when even higher performance is required, there are several new features.

This is why the _nolock functions are provided. They access the stream directly without thread locking. They must be used with extreme care, for example:

  • if your application is single-threaded (another process using the same file has its own data structures, and the OS manages concurrency there)
  • if you're sure that no two threads use the same stream (for example, if you have only one reader thread and writing is done outside your program).
  • if you have another synchronisation mechanism that protects a critical section of your code. For example, if you use a mutex lock, or a thread-safe non-blocking algorithm that makes use of atomics.

In such cases, the additional lock for stream access is redundant. For file-intensive functions, it could then be worth using the _nolock versions.

Note: as you've pointed out, it's only worth using the _nolock versions for intensive file access, where you make millions of calls.
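The mutex case can be sketched as follows. This is only an illustration under stated assumptions: `GuardedFile` and the `FREAD_NOLOCK` macro are hypothetical names, `_fread_nolock` is the MSVC function discussed here, `fread_unlocked` is the glibc extension used as a stand-in on Linux, and on any other platform the sketch simply falls back to the ordinary locking `fread`.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stdio.h>

// Pick an unlocked read where one is known to exist (assumption), otherwise
// fall back to the regular, locking fread.
#if defined(_MSC_VER)
  #define FREAD_NOLOCK(buf, size, count, f) _fread_nolock(buf, size, count, f)
#elif defined(__GLIBC__)
  #define FREAD_NOLOCK(buf, size, count, f) fread_unlocked(buf, size, count, f)
#else
  #define FREAD_NOLOCK(buf, size, count, f) fread(buf, size, count, f)
#endif

// Hypothetical wrapper: the application's own mutex is the only lock taken,
// so the stream's internal lock would be redundant.
class GuardedFile {
public:
    explicit GuardedFile(FILE* f) : file_(f) { assert(f != nullptr); }

    // Reads up to `bytes` bytes into `buf`; returns the number actually read.
    std::size_t read(void* buf, std::size_t bytes) {
        std::lock_guard<std::mutex> hold(mutex_);  // external synchronisation
        return FREAD_NOLOCK(buf, 1, bytes, file_);
    }

private:
    std::mutex mutex_;
    FILE* file_;
};
```

This is only safe as long as every access to that `FILE*`, from every thread, goes through the same wrapper. A single direct `fread` or `fseek` on the stream elsewhere would bypass the mutex.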

Christophe
_fread_nolock() appears to be meant for use once you have made sure that the stream is protected by an external mechanism (some form of mutex, probably); you then use it to reduce overhead. Related: What's the intended use of _fread_nolock, _fseek_nolock?

This may also answer any further questions you might have. Depending on what type of hard drive you have, it may or may not be possible for it to actually perform more than one I/O operation at the same time: https://superuser.com/questions/252959/which-is-faster-copying-everything-at-once-or-one-thing-at-a-time

CinchBlue
  • I saw the "What's the intended use of..." question before posting, and remained confused. The O/P says "the" function (I think this means fread) blocks re-entrant calls, allowing only one thread to be in the function as a whole at a time. But one answer indicates the locking is instead at the FILE* level. Another answer says the thread-safe versions are re-entrant, but you can't call two with the same FILE*. Another answer says performance is better with the _nolock versions, but doesn't mention whether actual disk I/O throughput is higher or whether it's just bypassing redundant locks. It left me with more questions. – user1902689 Apr 25 '15 at 21:53