I want to discuss the following structure in golang from this link

// Local per-P Pool appendix.
type poolLocal struct {
    private interface{}   // Can be used only by the respective P.
    shared  []interface{} // Can be used by any P.
    Mutex                 // Protects shared.
    pad     [128]byte     // Prevents false sharing.
}

The above structure can be accessed by only one thread at a time because a Mutex is used. The coder will lock the structure at the beginning of a thread and unlock it when the thread completes. So the memory is not shared between threads, and no more than one core will have access to it. So, by my understanding, false sharing cannot happen here. If false sharing cannot happen, why did the coder pad the structure with extra bytes (pad [128]byte)? Is my understanding wrong?

user3219492
  • Yes, your understanding is wrong: "False sharing" has nothing to do with several threads accessing the same memory (race condition if one writes) but with processor cache lines. Just google for "false sharing". – Volker Jan 17 '17 at 13:00
  • @Volker Thank you. I understood. Let me clarify. Let there be two threads t1 and t2 running on cores C1 and C2. Let t1 be allowed to run first (now t2 waits, as there is a mutex lock). Let t1 write data into C1's L1 cache (the cache line which stores our structure). Now t2 is ready on C2. But the L2 cache in C2 has to be updated (by the MESI protocol) to the new values. After this, t2 can proceed with its job. This is where false sharing occurs. Am I correct? – user3219492 Jan 17 '17 at 17:25
  • When you do padding, each thread will be allocated a cache line in L1 which will be pointing to different memory locations in L2 or higher memory. As each thread is working on different memory locations, they are not shared and so no MESI and no false sharing. Am I correct? – user3219492 Jan 17 '17 at 17:33
  • "Premature optimization is the root of all evil." - a famous quote by Sir Tony Hoare (popularized by Donald Knuth). Learn from it. And stop optimizing cache lines unless you have a perfect code and measure that this is your current #1 performance bottleneck. – k3a Aug 01 '20 at 17:35

1 Answer

Memory locations on the same cache line are subject to false sharing, which is very bad for performance. Cache line size ranges from 32 to 128 bytes, depending on the processor model. The 128-byte pad reduces the chance of the same cache line being used by different processors, and that improves performance.

As I see it, the following would be better, as it would be more explicit:

type poolLocal struct {
    _       [64]byte      // Prevents false sharing.
    private interface{}   // Can be used only by the respective P.
    shared  []interface{} // Can be used by any P.
    Mutex                 // Protects shared.
    _       [64]byte      // Prevents false sharing.
}
  • Oops. I don't get how this is the answer to my question. My question is: if only one thread has access to the memory, false sharing cannot happen, so why has the programmer added extra bytes? I have now cleared it up myself (look at my comments on the question). The aim is to make the size of the structure a multiple of the cache line. Why will it matter if it is padded at the start and/or the end? – user3219492 Jan 18 '17 at 06:12
  • Moreover, according to this link: http://www.geeksforgeeks.org/structure-member-alignment-padding-and-data-packing/ the compiler seems to change the order of the variables declared. So your style of padding and the one posted in the question make no difference – user3219492 Jan 18 '17 at 06:14
  • @user3219492 `false-sharing` has nothing to do with threads accessing the same memory. You may think of it as a performance problem due to two processes sharing the same `cache line` – Sarath Sadasivan Pillai Jan 18 '17 at 06:15
  • yes. I got my doubt cleared (comments in question). My point is, the one you have posted is not the answer for my doubt I feel. – user3219492 Jan 18 '17 at 06:20
  • @user3219492 Whether the compiler changes the order is language-specific; even ANSI C does not specify exactly how it does it. I have often seen people using padding in this manner in golang – Sarath Sadasivan Pillai Jan 18 '17 at 06:22
  • @user3219492 this thread may help you understand it better https://www.codeproject.com/Articles/51553/Concurrency-Hazards-False-Sharing#heading0005. Check this gist also https://github.com/Workiva/go-datastructures/blob/master/queue/ring.go – Sarath Sadasivan Pillai Jan 18 '17 at 06:47