When two threads run simultaneously and both read from and write to the same variable, the result can be strange behavior. Such access can be thread safe, but it is not in every case.

Thread safe example: TThread.Terminated

The Boolean property Terminated just reads FTerminated, which is set only once, and since it is a Boolean, the write is atomic. So the value can be read in the main thread as well as in the thread itself and is always thread safe to read.

My example: I have a string which is written only once. Unlike TThread.Terminated, the write to my string is not atomic, so reading it is not thread safe per se. But there may be a thread safe way in a special case: I just want to compare the string to another string, and I only do something if they are equal (it is not critical if they compare as unequal because the string is not completely written yet). So I wondered whether this may be thread safe or not. What exactly happens when the string is written, and what may go wrong if I read the string when it is only halfway written?

Steps to be done when writing a string:

  1. Reference Count = 1:
    1. Allocate additional memory, if new string is longer than old one
    2. Copy Characters
    3. Set new string length
    4. Deallocate memory, if new string is shorter than old one
  2. Reference Count > 1 (due to copy-on-write semantics a new string instance is needed):
    1. Allocate memory for new string instance
    2. Copy characters to new location and set length of the string
    3. Point the string instance pointer to the new location
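
The "Reference Count > 1" case above can be observed from the outside with a small console program. This is only a sketch: `SharesBuffer` is a hypothetical helper introduced here, and the program shows the visible effect of copy-on-write (a new buffer appears on the first write), not the internal order of the steps, which the RTL does not document as a guarantee.

```delphi
program CowSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: True when both variables point at the same heap buffer.
function SharesBuffer(const X, Y: string): Boolean;
begin
  Result := Pointer(X) = Pointer(Y);
end;

var
  A, B: string;
begin
  // Build the string at run time so it is a reference-counted heap string
  // rather than a constant literal.
  A := 'value ' + IntToStr(42);

  // Assignment copies only the pointer and increments the reference count.
  B := A;
  Writeln('after assignment, shared buffer: ', SharesBuffer(A, B));
  Writeln('reference count: ', StringRefCount(A));

  // Writing through B triggers copy-on-write: B gets its own buffer first,
  // so A keeps its original characters.
  B[1] := 'V';
  Writeln('after write, shared buffer: ', SharesBuffer(A, B));
  Writeln(A, ' / ', B);
end.
```

Everything here runs on a single thread. Nothing in the sketch says what a second thread would see while `B[1] := 'V'` is in progress, which is exactly the part the question asks about.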

Under what circumstances is it safe to read the string while it is being written at that very moment?

  1. Reference Count = 1:
    1. Reading is safe only (and in this case always) if the steps happen in the order listed above and a read that occurs just before the new length is set returns only as many characters as the currently stored length (not all the allocated bytes)
  2. Reference Count > 1:
    1. Reading is safe only (and in this case always) if the pointer to the string is set as the last step (setting this pointer is an atomic operation), or if the length is initialized to 0 before the pointer is set and the conditions for the case "Reference Count = 1" apply to the new string

Question to those who have such deep knowledge: Are my assumptions true? If yes, can I rely on this safely? Or is it such a bad idea to rely on these implementation specifics that it is not even worth thinking about all this, and I should simply never read a string without protection while another thread is writing to it?

RSE
  • For a Delphi string, it's never threadsafe if you have one thread writing and one thread reading. Protect with a lock. The `TThreadsafe` class in my answer here does the job: http://stackoverflow.com/questions/19703274/generic-threadsafe-property – David Heffernan Oct 15 '14 at 14:56
  • @David sure, that would be the easy approach, but I wanted to dive deeper into it. Why synchronize with a thread if I don't have to? Can you explain why it is not threadsafe in my outlined case? – RSE Oct 15 '14 at 15:07
  • If you don't need to synchronize, then for sure don't. I don't see this as a choice between easy or hard. I'm not proposing that you opt for the easy approach. I think that correctness should be the goal here. Wouldn't you agree? – David Heffernan Oct 15 '14 at 15:07
  • Since you write to the string exactly once you can avoid a lock by adding an extra boolean. Write the string, and set the boolean to true. If you want to read the string, then check the boolean before attempting to read the string. This works, but adds complexity. For no benefit whatsoever. A lock like a critical section is only expensive when there is contention. Since you write exactly once there will be effectively no contention. – David Heffernan Oct 15 '14 at 15:16
  • @David there would possibly be contention, because I make multiple attempts at reading this string until it is equal. These reads could contend with the moment of writing it. Either way, this could be called micro-optimization, and I'm only going into it this deeply to learn. In summary, you are saying that reading that string without any safeguard could lead to reading uninitialized data or even an access violation? – RSE Oct 15 '14 at 15:36
  • There is effectively no discernible contention because you write exactly once. That means there can be one short instant in time with contention. You can do it lock free with a flag if you want. As for the question that you asked in that comment, read my very first comment again. What you are proposing is not threadsafe. End of story. – David Heffernan Oct 15 '14 at 15:42
  • There are also many different ways to solve race and contention issues. The best is often to avoid using shared data. Waiting for an event could be an option. Message passing can be good. A good old fashioned lock could be good. Do you even have a performance problem? – David Heffernan Oct 15 '14 at 15:44
  • I don't have a performance problem. I just wanted to know why it is not thread safe, what can happen. Just before I wrote this question I found that I'm reading a string which may not yet be set in the thread. If it was thread safe I could just leave it as is, but as it turned out I cannot. I just have to make sure that I don't read it before it is set - not really a problem. As for the "what could happen"-part: That's just my curiosity. So I would gladly accept your answer if it wasn't just a comment to my question... – RSE Oct 15 '14 at 15:58

2 Answers

Delphi strings are "thread-safe" only in the sense that a string's reference count is guaranteed to be valid in multithreaded code.

Copy-On-Write of Delphi strings is not a threadsafe operation; if you need a multithreaded read/write access to the same string you generally should use some synchronization, otherwise you are potentially in trouble.
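
As a minimal sketch of "some synchronization", the shared string can be wrapped in a small class that guards every read and write with a critical section. `TLockedString` and its methods are hypothetical names made up for this illustration; they are not part of the RTL and not the `TThreadsafe` class linked in the comments. `TCriticalSection` comes from `System.SyncObjs`.

```delphi
program LockedStringSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.SyncObjs;

type
  // Hypothetical helper: a string whose reads and writes are serialized.
  TLockedString = class
  private
    FLock: TCriticalSection;
    FValue: string;
  public
    constructor Create;
    destructor Destroy; override;
    procedure SetValue(const AValue: string);
    function GetValue: string;
  end;

constructor TLockedString.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
end;

destructor TLockedString.Destroy;
begin
  FLock.Free;
  inherited;
end;

procedure TLockedString.SetValue(const AValue: string);
begin
  FLock.Acquire;
  try
    FValue := AValue;
  finally
    FLock.Release;
  end;
end;

function TLockedString.GetValue: string;
begin
  FLock.Acquire;
  try
    // The caller receives its own reference, taken while the lock is held,
    // so a later write cannot free the buffer the caller is reading.
    Result := FValue;
  finally
    FLock.Release;
  end;
end;

var
  Shared: TLockedString;
  Writer: TThread;
begin
  Shared := TLockedString.Create;
  try
    Writer := TThread.CreateAnonymousThread(
      procedure
      begin
        Shared.SetValue('expected');  // the single write from the question
      end);
    Writer.FreeOnTerminate := False;  // we wait for and free the thread ourselves
    Writer.Start;
    try
      // Poll until the writer has stored the value; every read is locked.
      while Shared.GetValue <> 'expected' do
        Sleep(1);
      Writeln('strings match');
    finally
      Writer.WaitFor;
      Writer.Free;
    end;
  finally
    Shared.Free;
  end;
end.
```

Because the string is written exactly once, the lock is contended at most for one short instant, which matches the point made in the comments that the cost of the critical section is negligible here.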

kludg

Example of what could happen without any lock:

The string is being written: it should become bigger than it was, so new memory is allocated. But the pointer is not yet modified, so it still points to the old string.

At the same time, the reading thread gets the pointer and begins to read the old string.

The context switches back to the writing thread. It changes the pointer, so now it points to the new string. The old string's reference count drops to 0 and it is immediately freed.

The context switches again: the reading thread continues to process the old string, but this is now an access to deallocated memory, which may easily result in an access violation.

Yuriy Afanasenkov