
I was reading the header files of the pthreads library and found this particular definition of the mutex (and other types) in bits/pthreadtypes.h:

typedef union
{
  struct __pthread_mutex_s
  {
    int __lock;
    unsigned int __count;
    int __owner;
    /* KIND must stay at this position in the structure to maintain
       binary compatibility.  */
    int __kind;
    unsigned int __nusers;
    __extension__ union
    {
      int __spins;
      __pthread_slist_t __list;
    };
  } __data;
  char __size[__SIZEOF_PTHREAD_MUTEX_T];
  long int __align;
} pthread_mutex_t;

It's not exactly like this, but I've simplified it for clarity. A common technique for hiding an implementation (an opaque type) is to give a struct two different definitions: the real one in the implementation file, and in the header just a character buffer of the same size. The layout stays hidden, yet the correct amount of memory is still allocated when the user calls malloc or places an object on the stack.

This particular implementation is using a union and still exposing both the definition of the struct and the character buffer, but doesn't seem to provide any benefits in terms of hiding the type as the struct is still exposed and the binary compatibility is still dependent on the structures being unchanged.

  1. Why are the types defined in pthreads following this pattern?
  2. What are the benefits of having opaque types if you're not providing binary compatibility (as in the opaque pointer pattern)? I understand security is one of them as you aren't allowing the user to tamper with the fields of the struct, but is there anything else?
  3. Are pthread types exposed mostly to allow static initializations or is there any other specific reason for this?
  4. Would a pthreads implementation following the opaque pointer pattern be feasible (i.e. one that exposes no types at all and allows no static initialization)? More specifically, is there any situation where a problem can only be solved with static initialization?
  5. And totally unrelated, are there "before main" threads in C?
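For reference, the static initialization mentioned in questions 3 and 4 looks roughly like the sketch below. PTHREAD_MUTEX_INITIALIZER and the lock/unlock calls are standard POSIX; the names g_lock, g_counter and next_id are made up for illustration. The point is that the mutex is usable without any run-time init call, which requires the complete type to be visible to the compiler.

```c
#include <assert.h>
#include <pthread.h>

/* The mutex is ready to use at program start: no pthread_mutex_init()
   call and no heap allocation are needed. This only compiles because
   the header exposes the full size (and an initializer) of the type. */
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_counter;   /* hypothetical shared state guarded by g_lock */

/* Hypothetical helper: hands out increasing ids under the lock. */
int next_id(void)
{
    int rc = pthread_mutex_lock(&g_lock);
    assert(rc == 0);
    int id = ++g_counter;
    rc = pthread_mutex_unlock(&g_lock);
    assert(rc == 0);
    (void)rc;   /* silences the unused warning when NDEBUG is defined */
    return id;
}
```

With an opaque pointer API the same code would need an explicit create call somewhere before the first use.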
timrau
AnilM3
  • This looks to me like it is using the char buffer to hide the implementation details while providing a way for memory allocation to work correctly. So all the implementation details are in a memory area that is smaller or the same size as the char buffer. This allows pthreads to be used with a stack allocated struct as well as a heap allocated struct providing flexibility in how the memory for the pthreads management data struct is allocated. The main reason for opaque types is to prevent people from developing a dependency on a particular memory layout. – Richard Chambers Jun 09 '14 at 15:40
  • Could you also post the header file declaration? – Lundin Jun 13 '14 at 09:25

4 Answers

My take is that the __size and __align fields specify (guess what :-) ) the size and alignment of the structure independently of the __data structure. So the data can be smaller and have weaker alignment requirements, and it can be modified freely without breaking these basic assumptions about it. And vice versa: these basic characteristics can be changed without altering the data structure, like here.

It is important to note that if the size of the __data becomes bigger than specified by __SIZEOF_PTHREAD_MUTEX_T, an assertion fails in __pthread_mutex_init():

assert (sizeof (pthread_mutex_t) <= __SIZEOF_PTHREAD_MUTEX_T);

Consider this assertion as an essential part of this approach.

So, the conclusion is that this was done not to hide the implementation details, but to make the data structure more predictable and manageable. That matters a great deal for a widely used library, which has to care about backward compatibility and about the performance impact on other code of any change made to this structure.
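A minimal sketch of the idea, with compile-time checks standing in for the run-time assertion. Everything here is hypothetical: demo_mutex_t, DEMO_SIZEOF_MUTEX and the field names are illustrative stand-ins, not the glibc definitions.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define DEMO_SIZEOF_MUTEX 40   /* hypothetical fixed, ABI-visible size */

typedef union {
    struct demo_mutex_s {      /* the "real" fields, free to evolve */
        int lock;
        unsigned int count;
        int owner;
        int kind;
    } data;
    char size[DEMO_SIZEOF_MUTEX];  /* pins the minimum size */
    long int align;                /* pins the minimum alignment */
} demo_mutex_t;

/* The fields may change, as long as they still fit in the reserved space. */
static_assert(sizeof(struct demo_mutex_s) <= DEMO_SIZEOF_MUTEX,
              "internal fields outgrew the reserved size");

/* A union is at least as strictly aligned as its strictest member. */
static_assert(alignof(demo_mutex_t) >= alignof(long int),
              "union must be at least long-aligned");

/* Reports the size client code actually sees. */
size_t demo_mutex_size(void)
{
    return sizeof(demo_mutex_t);
}
```

Adding or removing a field in demo_mutex_s leaves sizeof(demo_mutex_t) unchanged until the first static_assert fires, which is the predictability described above.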

Anton
  • So this was made to keep the `sizeof( pthread_mutex_t )` stay the same size, regardless of the platform? – this Jun 10 '14 at 22:28
  • @self., it'd be oversimplification to say just 'yes'. Not only the platforms can differ (e.g. configurations, versions) and not only the size (also the alignment). – Anton Jun 11 '14 at 09:21
  • Can you provide a specific scenario in which this alignment and size constraints are required? Also, can you provide an answer to the rest of the questions? – AnilM3 Jun 13 '14 at 06:40
  • @Anton Sorry you didn't get the full bounty; due to the lack of answer updates I missed the deadline. – this Jun 17 '14 at 21:27

Standards committees such as the IEEE and POSIX develop and evolve standards with iterations that provide more functionality or to correct problems with previous versions of the standards. This process is driven by the needs of people who have problem domain software needs as well as by the vendors of software products supporting those people. Typically the implementation of a standard will vary between vendors to some degree. Like any other software, different people provide differences in implementation depending on the target environment as well as their own skills and knowledge. However as the standard matures there is a kind of Darwinian selection in which there is an agreement on best practices and the various implementations begin to converge.

The first versions of the POSIX pthreads library appeared in the 1990s, targeting UNIX-style operating system environments; see, for instance, POSIX.4: Programming for the Real World and PThreads Primer: A Guide to Multithreaded Programming. The ideas and concepts for the library originated in earlier attempts to provide co-routine or thread functionality that worked at a finer level than the operating system process, in order to reduce the overhead of creating, managing, and destroying processes. There were two major approaches to threading: user level, with little kernel support, and kernel level, which depended on the operating system for thread management. The two offered somewhat different capabilities, such as whether pre-emptive thread switching was available.

In addition there were also the needs of tool makers such as debuggers to provide support for working in a multi-threaded environment and being able to see thread state and to identify specific threads.

There are several reasons for using an opaque type within the API for a library. The primary reason is to allow the developers of the library the flexibility to modify the type without causing problems for users of the library. There are several ways of creating opaque types in C.

One way is to require users of the API to use a pointer to some memory area that is managed by the API library. You can see examples of this approach in the Standard C Library with the file access functions such as fopen() which returns a pointer to a FILE type.

While this accomplishes the goal of creating an opaque type, it requires the API library to manage the memory allocation. Because the user holds only pointers, you can run into problems of memory being allocated and never released, or of a pointer being used after its memory has been released. It also makes the functionality hard to port to specialized hardware, for instance a sensor with bare-bones support that does not include a memory allocator. This kind of hidden overhead also matters for applications with limited resources, where you need to predict or model the resources the application uses.
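A minimal sketch of this first, pointer-based approach, assuming a made-up Counter type (the counter_* names are hypothetical and only illustrate the pattern that FILE and fopen() follow). The header part and the implementation part are shown together so the example is self-contained.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what the public header would contain ---- */
typedef struct Counter Counter;        /* incomplete type: layout hidden */
Counter *counter_create(int start);    /* the library allocates */
int      counter_next(Counter *c);
void     counter_destroy(Counter *c);  /* the library frees */

/* ---- what only the implementation file would contain ---- */
struct Counter {
    int value;
};

Counter *counter_create(int start)
{
    Counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;                          /* NULL when allocation fails */
}

int counter_next(Counter *c)
{
    assert(c != NULL);
    return c->value++;                 /* returns the value, then advances */
}

void counter_destroy(Counter *c)
{
    free(c);
}
```

Client code cannot declare a Counter on the stack or as a static object, because sizeof(Counter) is unknown to it; every instance has to go through counter_create() and counter_destroy().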

A second way is to provide to the users of the API a data struct that is the same size as the actual data struct used by the API but that uses a char buffer to allocate the memory. This approach hides the details of the memory layout, since all the user of the API sees is a single char buffer or array, yet it also allocates the correct amount of memory that is used by the API. The API then has its own struct that lays out how the memory is actually used and the API does a pointer conversion internally to change the struct used to access the memory.

This second approach provides a couple of nice benefits. First of all, the memory used by the API is now managed by the user of the API and not the library itself. The user of the API can decide if they want to use stack allocation or global static allocation or some other memory allocation such as malloc(). The user of the API can decide if they want to wrap the memory allocation in some kind of resource tracking such as a reference counting or some other management that the user wants to do on their side (though this could also be done with pointer opaque types as well). This approach also allows the user of the API to have a better idea of memory consumption and to model memory consumption for specialized applications on specialized hardware.

The API designer could also provide some types of data to the user of the API which might be handy such as status information. The goal of this status information is to allow the user of the API to query what are tantamount to read only members of the struct directly rather than going through the overhead of some kind of a helper function in the interests of efficiency. While the members are not specified as const (to encourage the C compiler to reference the actual member rather than caching the value at some point in time depending on it to not change), the API may update the fields during operations to provide information to the user of the API while not depending on the values of those fields for its own use.

However any such data fields run the risk of introducing problems with backwards compatibility as well as changes introducing memory layout problems. A C compiler may introduce padding between the members of a struct in order to provide for efficient machine instructions when loading and storing data into those members or due to CPU architecture requiring some kind of a starting memory address boundary for some kinds of instructions.
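The padding effect can be observed directly. The sketch below uses a hypothetical two-member struct; the exact amount of padding is platform dependent, so the checks only state what is guaranteed.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical struct: a 1-byte member followed by an int. */
struct padded_demo {
    char tag;
    int  value;
};

/* The compiler places 'value' on a boundary suitable for int, which
   usually means inserting padding bytes after 'tag'. */
static_assert(offsetof(struct padded_demo, value) % alignof(int) == 0,
              "value must be suitably aligned");
static_assert(sizeof(struct padded_demo) >= sizeof(char) + sizeof(int),
              "a struct is never smaller than the sum of its members");

/* Number of bytes in the struct that belong to no member. */
size_t padding_bytes(void)
{
    return sizeof(struct padded_demo) - (sizeof(char) + sizeof(int));
}
```

Because the padding can differ between compilers and targets, client code that computes member offsets by hand instead of using offsetof can break when the library is rebuilt.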

Specifically for the pthreads library, we have the influence of UNIX style C programming of the 1980s and 1990s which tended to have open and visible data structures and header files allowing programmers to read the struct definitions and defined constants with comments since much of the available documentation was the source.

A brief example of an opaque struct would be as follows. There is the include file, thing.h, which contains the opaque type and which is included by anyone using the API. Then there is a library whose source file, thing.c, contains the actual struct used.

thing.h may look like

#define MY_THING_SIZE  256

typedef struct {
    char  array[MY_THING_SIZE];
} MyThing;

int DoMyThing (MyThing *pMyThing, int stuff);

Then in the implementation file, thing.c, you might have source like the following

#include "thing.h"

typedef struct {
    int   thingyone;
    int   thingytwo;
    char  aszName[32];
} RealMyThing;

int DoMyThing (MyThing *pMyThing, int stuff)
{
    RealMyThing *pReal = (RealMyThing *)pMyThing;

    // do stuff with the real memory layout of MyThing
    return 0;
}
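A slightly fuller sketch of the same idea, assuming a hypothetical InitMyThing() helper and a DoMyThing() that actually touches a field. Two things are added that the example above leaves out: a compile-time check that the real struct fits in the buffer, and memcpy in place of the pointer cast, which sidesteps the alignment and aliasing questions that casting a char buffer raises. It also shows the user choosing static, stack, or heap storage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MY_THING_SIZE  256

/* Public view (thing.h): just a correctly sized buffer. */
typedef struct {
    char  array[MY_THING_SIZE];
} MyThing;

/* Private view (thing.c): the real layout. */
typedef struct {
    int   thingyone;
    int   thingytwo;
    char  aszName[32];
} RealMyThing;

static_assert(sizeof(RealMyThing) <= sizeof(MyThing),
              "RealMyThing no longer fits in MyThing");

/* Hypothetical initializer: puts the buffer into a known state. */
void InitMyThing(MyThing *pMyThing)
{
    memset(pMyThing->array, 0, sizeof pMyThing->array);
}

/* Adds 'stuff' to the hidden thingyone field and returns the new value. */
int DoMyThing(MyThing *pMyThing, int stuff)
{
    RealMyThing real;

    memcpy(&real, pMyThing->array, sizeof real);   /* buffer -> real view */
    real.thingyone += stuff;
    memcpy(pMyThing->array, &real, sizeof real);   /* real view -> buffer */
    return real.thingyone;
}

static MyThing g_thing;                /* static storage, no allocator */

/* The caller, not the library, decides where each MyThing lives. */
int demo_storage_choices(void)
{
    MyThing  onStack;                  /* automatic storage */
    MyThing *onHeap = malloc(sizeof *onHeap);
    int sum;

    InitMyThing(&g_thing);
    InitMyThing(&onStack);
    sum = DoMyThing(&g_thing, 1) + DoMyThing(&onStack, 2);

    if (onHeap != NULL) {
        InitMyThing(onHeap);
        sum += DoMyThing(onHeap, 3);
        free(onHeap);
    }
    return sum;
}
```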

Concerning "before main" threads

When an application using the C run time is started, the loader uses the entry point for the C run time as the application starting place. The C run time then performs the initialization and environmental setup that it needs to do and then invokes the designated entry point for the actual application. Historically this designated entry point is the function main() however what the C run time uses can vary between operating systems and development environments. For instance for a Windows GUI application the designated entry point is WinMain() (see WinMain entry point) rather than main().

It is up to the C run time to determine the conditions under which the designated entry point for the application is called. Whether there are "pre-main" threads running will depend on the C run time and the target environment.

With a Windows application using Active-X controls with their own message pump there could well be "pre-main" threads. I work with a large Windows application that uses several controls providing various kinds of device interfaces and when I look in the debugger, I can see a number of threads which the source of my application does not create with a specific create thread call. These threads are started by the run time as the Active-X controls used are loaded in and started.

Richard Chambers
  • This answer completely fails to address the question why a library would expose its internal fields. The equivalent of `thing.h` **doesn't** look like that, it is a union of a padding array and the actual fields. The question asks about the possible reasoning for that, and your answer unfortunately provides none. – user4815162342 Jun 09 '14 at 18:13
  • @user4815162342, the pthread struct looks to me like an evolving struct that started off simple with exposed fields that was then kept for backwards compatibility while making additional changes using an opaque data area. Lots of old software exposed internal fields just because it was simpler and easier in a simpler and easier world. – Richard Chambers Jun 09 '14 at 18:26

Yes, normally an implementation would hide most of the details of such a struct, either like this (where presumably __SIZEOF_PTHREAD_MUTEX_T is defined in some previously included system header file):

typedef union
{
    char      __size[__SIZEOF_PTHREAD_MUTEX_T];
    long int  __align;
} pthread_mutex_t;

Or like this:

typedef union
{
#if __COMPILE_FOR_SYSTEM
    struct __pthread_mutex_s
    {
        ...internal struct member declarations...
    } __data;
#endif
    char __size[__SIZEOF_PTHREAD_MUTEX_T];
    long int __align;
} pthread_mutex_t;

The first form completely isolates the internals of the struct declaration from client code. Getting access to the actual internals of the struct would then require including a system kernel header file with the full struct declaration, something that regular client code would not normally have access to. Since client code should be dealing only with pointers to this struct/union type, the actual members can remain hidden from all client code.

The second form exposes the struct internals to the programmer, but not to the compiler (presumably the __COMPILE_FOR_SYSTEM is defined in some other system header file that would only be used when compiling kernel code).

The question remains, then, why the implementers of this library chose to leave the internal details visible to the compiler? After all, it would seem that the second solution would be very easy to provide.

My guess is that the implementers either simply forgot about it in this particular case, or perhaps their source and header files are arranged imperfectly, so that they need to keep the members exposed for their own compiles to work (but this is rather doubtful).

Sorry that this does not really answer your question.

David R Tribble
  1. I've seen a union of a struct and a data buffer used before to support double-word compare-and-swap instructions (which have specific alignment requirements); such an instruction may be what they use to implement the mutex functions.

  2. It gives implementers more freedom to realize their vision of a fast and efficient pthreads library while still providing the end user with a unified interface.

  3. main is an intrinsic concept; normally a function is called before main to set up the standard file descriptors, among other things. In GCC you can add '__attribute__ ((constructor))' to a function and it will be called before main (it could then launch a bunch of threads and return). However, a root process/thread that spawns other processes or threads always has to come first (in case that was your question).
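A minimal sketch of the constructor attribute from point 3, assuming GCC or Clang. The names early_setup, g_ran_before_main and require_early_setup are hypothetical. To keep the example small the constructor only sets a flag; a call such as pthread_create() could be placed at the marked spot.

```c
#include <assert.h>

static int g_ran_before_main;   /* set by the constructor below */

/* GCC/Clang extension: the function runs during program start-up,
   before main() is entered. */
__attribute__((constructor))
static void early_setup(void)
{
    /* A thread could be started here, e.g. with pthread_create(). */
    g_ran_before_main = 1;
}

/* Hypothetical check that main() (or anything it calls) can use. */
void require_early_setup(void)
{
    assert(g_ran_before_main == 1);
}
```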

ballard26