I'm trying to create a system call handler, and I'm not sure how to store it.

I'm using the following typedef to store a (void *) pointer, which should receive the address of the function and an integer arg_no representing the number of arguments. Then, I create an array of this type.

typedef struct
{
  void *foo;
  int arg_no;
}td_sys_call_handler;

td_sys_call_handler ish[SYSCALL_HANDLER_NUM];

I'm trying to initialize the array in the following manner.

  ish[0].foo  = void     (*halt) (void);                  ish[0].arg_no  = 0;
  ish[1].foo  = void     (*exit) (int status) NO_RETURN;  ish[1].arg_no  = 1;
  ish[2].foo  = pid_t    (*exec) (const char *file);      ish[2].arg_no  = 1;
  ish[3].foo  = int      (*wait) (pid_t);                 ish[3].arg_no  = 1;
  ish[4].foo  = bool     (*create) (const char *file, unsigned initial_size);
                                                          ish[4].arg_no  = 2;
  ish[5].foo  = bool     (*remove) (const char *file);    ish[5].arg_no  = 1;
  ish[6].foo  = int      (*open) (const char *file);      ish[6].arg_no  = 1;
  ish[7].foo  = int      (*filesize) (int fd);            ish[7].arg_no  = 1;
  ish[8].foo  = int      (*read) (int fd, void *buffer, unsigned length);
                                                          ish[8].arg_no  = 3;
  ish[9].foo  = int      (*write) (int fd, const void *buffer, unsigned length);
                                                          ish[9].arg_no  = 3;
  ish[10].foo = void     (*seek) (int fd, unsigned position);
                                                          ish[10].arg_no = 2;
  ish[11].foo = unsigned (*tell) (int fd);                ish[11].arg_no = 1;

But all the assignments from the function pointer to the void pointer produce the following error:

../../userprog/syscall.c: In function ‘syscall_init’:
../../userprog/syscall.c:76:17: error: expected expression before ‘void’
../../userprog/syscall.c:77:17: error: expected expression before ‘void’
../../userprog/syscall.c:78:17: error: expected expression before ‘pid_t’
../../userprog/syscall.c:79:17: error: expected expression before ‘int’
../../userprog/syscall.c:80:17: error: expected expression before ‘_Bool’
../../userprog/syscall.c:82:17: error: expected expression before ‘_Bool’
../../userprog/syscall.c:83:17: error: expected expression before ‘int’
../../userprog/syscall.c:84:17: error: expected expression before ‘int’
../../userprog/syscall.c:85:17: error: expected expression before ‘int’
../../userprog/syscall.c:87:17: error: expected expression before ‘int’
../../userprog/syscall.c:89:17: error: expected expression before ‘void’
../../userprog/syscall.c:91:17: error: expected expression before ‘unsigned’

I was under the impression that void* is the only instance of polymorphism in the language and that it can point to anything. However, it appears that I'm wrong.

So which is the type of the pointer which can store the address of any function type?

Also, can you give me a good reference about C polymorphism? I've looked in many books but as far as I've seen the polymorphism chapter is very thin.

Thank you.

bsky

6 Answers

Yes, you are wrong.

void * pointers can point at any kind of data, but in C, code (functions) is not data.

The standard does not even define conversions between void * and function pointers: on most contemporary computers a cast will work as expected, but the language does not guarantee that.

I don't understand from your code how you intended the "overloading" to be used in practice. How do you expect to call through the foo pointer? Just having the expected number of arguments is not enough: arguments have types, and are therefore handled differently in the function call.
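
For what it's worth, the standard does provide a portable container for "any function": another function pointer type. C guarantees that a pointer to a function of one type can be converted to a pointer to a function of another type and back, and that the result compares equal to the original; only calling through the wrong type is undefined. A minimal sketch, in which `generic_fn`, `handler`, `sys_halt`, `sys_wait` and the two `call_*` helpers are names invented for illustration and are not taken from the question's code base:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic type: any function pointer type would do. */
typedef void (*generic_fn)(void);

typedef struct {
    generic_fn fn;   /* stored with its real type erased */
    int arg_no;      /* number of arguments the real function takes */
} handler;

/* Stand-ins for real system calls, invented for this sketch. */
static int halted = 0;
static void sys_halt(void) { halted = 1; }
static int sys_wait(int pid) { return pid + 1; }

static const handler table[] = {
    { (generic_fn)sys_halt, 0 },
    { (generic_fn)sys_wait, 1 },
};

/* Call entry 1 after casting back to its true type. */
static int call_wait(int pid)
{
    assert(table[1].fn != NULL);
    int (*real)(int) = (int (*)(int))table[1].fn;
    return real(pid);
}

/* Call entry 0; it already has the generic type's signature. */
static void call_halt(void)
{
    table[0].fn();
}
```

The caller still has to know the true signature of each entry, which is exactly the problem raised above; the generic type only solves the storage half.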

unwind
  • @problemPotato This will only work for functions returning `void`, so it will fail for e.g. `wait`, which returns an `int`. The OP needs to use a union at least for the different return types he must handle. I started to write an answer describing this in some detail, but then noticed the different parameter types of the function which make it impossible to portably invoke the stored function without having descriptions for both the return type and the type of each parameter, as well as a huge `switch` with all combinations thereof. – user4815162342 Mar 07 '14 at 15:59
  • Processors using a [Harvard design](https://en.wikipedia.org/wiki/Harvard_architecture) are a perfect example. As code and data are in completely disjoint address spaces, it is not possible to hold a function pointer in a data pointer. Atmel processors (that power arduinos among other things) use that scheme. – spectras Sep 28 '17 at 23:57
  • @spectras Good point, absolutely. I guess it's still fixable for a compiler implementor, at the cost of making `void *` wide enough to contain an extra bit that tells if it's pointing at data or code. Not that they would need to, of course. – unwind Sep 29 '17 at 08:00
  • That would come at a huge cost: every single pointer access would have to check the bit. That means the test itself, a conditional branch, instructions to access data space, an unconditional jump, instructions to access program space (they are entirely different instructions). – spectras Sep 29 '17 at 08:34

The notation you need casts the system call function pointer to void *:

ish[0].foo  = (void *)halt;

The C standard does not guarantee that pointers to functions will fit into pointers to data such as void *; fortunately for you, POSIX steps in and does guarantee that pointers to functions are the same size as pointers to data.
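
As a rough sketch of the full round trip under that assumption: the struct below is the one from the question, while `demo_tell`, `make_tell_handler` and `call_tell` are hypothetical names made up for this example. The static assertion turns the size assumption into a compile-time check, so the file simply fails to build on a platform where it does not hold.

```c
#include <assert.h>

typedef struct {
    void *foo;
    int arg_no;
} td_sys_call_handler;

/* Compile-time check of the assumption this approach relies on. */
_Static_assert(sizeof(void *) == sizeof(void (*)(void)),
               "function pointers must fit in void *");

/* Hypothetical stand-in for a real system call. */
static unsigned demo_tell(int fd) { return (unsigned)fd * 2u; }

/* Store the function's address; the cast is not guaranteed by ISO C. */
static td_sys_call_handler make_tell_handler(void)
{
    td_sys_call_handler h = { (void *)demo_tell, 1 };
    return h;
}

static unsigned call_tell(const td_sys_call_handler *h, int fd)
{
    assert(h->arg_no == 1);
    /* Cast back to the exact original type before calling. */
    unsigned (*fn)(int) = (unsigned (*)(int))h->foo;
    return fn(fd);
}
```

A strictly conforming compiler may warn about both casts (GCC does so with -pedantic), which is a fair reminder that this relies on the platform rather than on the language.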

Jonathan Leffler
  • In POSIX 2008, this requirement is somewhat hidden in [dlsym](http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html) (Application Usage). §2.12.3 went missing, I think. – Michael Foukarakis Mar 07 '14 at 15:30
  • @MichaelFoukarakis: That's very odd. Section 2.13.3 was there previously — I've quoted it frequently on SO, and I got the text from [Data types](http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_12), copying the material from the online reference. The rationale still mentions it: [pointer types](http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xsh_chap02.html#tag_22_02_12_03) (poorly worded, but…). So, either POSIX 2013 changed the rules without rationalizing the change or there's a glitch on the web site. But I agree that the info is missing today. – Jonathan Leffler Mar 07 '14 at 15:36
  • @MichaelFoukarakis: One of a number of posts I've made quoting the POSIX standard of the time: [Can the size of pointers vary depending on what's pointed to?](http://stackoverflow.com/questions/1473935/can-the-size-of-pointers-vary-depending-on-whats-pointed-to/1474667#1474667) – Jonathan Leffler Mar 07 '14 at 15:41
  • I've submitted a request to OpenGroup asking whether the omission of 2.13.3 is accidental or deliberate. I don't know how quickly I'll get a response. – Jonathan Leffler Mar 07 '14 at 16:25
  • Apparently there was an issue filed ([#74](http://austingroupbugs.net/view.php?id=74)) to remove that text. – Michael Foukarakis Mar 11 '14 at 12:09
  • @MichaelFoukarakis: Yes; I've got all the information, and need to make it available. I am still trying to understand how the requirement on `dlsym()` that you can retrieve pointers to variables and pointers to functions would work on a system where a pointer to function is not the same size as a pointer to variable. I've been busy since I got the information, and I think it is going to need to be a new self-answered question (and one part of the process will need to be tracking down questions where I've quoted the older version of the POSIX standard; there are quite a few of those, I think). – Jonathan Leffler Mar 12 '14 at 01:02

Your syntax is wrong. You should declare your function pointer first, as a variable in its own right. Then you can assign the address of that function pointer variable to the void pointer.

void (*halt) (void) = halt_sys_call_function;
ish[0].foo  = &halt; ish[0].arg_no  = 0;

C doesn't support traditional inheritance relationships in a direct way, but it does guarantee that the address of a structure is also the address of the first member of the structure. This can be used to emulate polymorphism in C. I described a similar approach in an answer I wrote about dynamic dispatch in C.
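
A small sketch of that first-member technique, with all names (`shape`, `rect`, `square`, `area_of` and so on) invented for illustration: each "derived" struct embeds the "base" struct as its first member, so a pointer to the derived object can be converted to a pointer to the base and back.

```c
#include <assert.h>
#include <stddef.h>

/* "Base class": holds the one virtual function. */
typedef struct shape {
    int (*area)(const struct shape *self);
} shape;

/* "Derived classes": the base must be the first member. */
typedef struct { shape base; int w, h; } rect;
typedef struct { shape base; int side; } square;

static int rect_area(const shape *self)
{
    const rect *r = (const rect *)self;   /* valid: base is the first member */
    return r->w * r->h;
}

static int square_area(const shape *self)
{
    const square *s = (const square *)self;
    return s->side * s->side;
}

/* Works on any object whose first member is a shape. */
static int area_of(const shape *s)
{
    assert(s != NULL && s->area != NULL);
    return s->area(s);
}
```

With `rect r = { { rect_area }, 3, 4 };`, the call `area_of(&r.base)` dispatches to `rect_area` without `area_of` knowing anything about rectangles.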

jxh
  • It is not entirely clear, but I'd venture to guess that his `halt_sys_call_function` is *already* named `halt`, so the first line is not necessary. – user4815162342 Mar 07 '14 at 15:29
  • @user4815162342: He mixed function pointer declaration into the assignment. If the intention is to declare a function pointer, then it needs to be its own declaration, and it needs a value assigned to it. – jxh Mar 07 '14 at 15:31
  • It is my impression that the OP is simply confused and simply intends to store the address of the function to the `void *` (which is not standard-conforming, as others pointed out). In that case, there is no gain from an intermediate function pointer variable. – user4815162342 Mar 07 '14 at 15:56
  • @user4815162342: My answer fixes that by using the void pointer to store the address of the function pointer variable. – jxh Mar 07 '14 at 16:31
  • That's actually a neat idea, +1. You might want to mention that `halt` should have static storage duration, or at least that its lifetime must not be shorter than that of `ish`. – user4815162342 Mar 07 '14 at 18:09

Consider a struct formatted to hold each function specifically:

typedef struct 
{

  void     (*halt) (void);                  
  void     (*exit) (int status);  
  pid_t    (*exec) (const char *file);      
  int      (*wait) (pid_t);                 
  bool     (*create) (const char *file, unsigned initial_size);
  bool     (*remove) (const char *file);    
  int      (*open) (const char *file);      
  int      (*filesize) (int fd);            
  int      (*read) (int fd, void *buffer, unsigned length);
  int      (*write) (int fd, const void *buffer, unsigned length);  
  void     (*seek) (int fd, unsigned position);   
  unsigned (*tell) (int fd);                

} myFuncs;

OR

This is messy and VERY unmaintainable, but if you did cast each pointer to a void*, using void *addressOfWait = (void*)&wait;, then you could cast it back to the correct function pointer type before calling:

int (*waitFunctionPointer)(pid_t) = (int (*)(pid_t))addressOfWait;

Then you could call that pointer:

waitFunctionPointer((pid_t) 1111); //wait for process with pid of 1111
problemPotato

I'll ask for @problemPotato's forgiveness for filching his structure definition:

typedef struct 
{
   void     (*halt) (void);                  
   void     (*exit) (int status);  
   pid_t    (*exec) (const char *file);      
   int      (*wait) (pid_t);                 
   bool     (*create) (const char *file, unsigned initial_size);
   bool     (*remove) (const char *file);    
   int      (*open) (const char *file);      
   int      (*filesize) (int fd);            
   int      (*read) (int fd, void *buffer, unsigned length);
   int      (*write) (int fd, const void *buffer, unsigned length);  
   void     (*seek) (int fd, unsigned position);   
   unsigned (*tell) (int fd);                
} fs_ops;

Say you have matching functions, declared like:

int      ext5_open(const char * file);
unsigned ext5_tell (int fd);

then you can define and initialize a variable like (the bare name of the function is a pointer to it):

fs_ops ext5_ops = {
   .open = ext5_open,
   .tell = ext5_tell,
};

Fields that aren't initialized get NULL (i.e., pointer to no function). You can change the value of a field, ask if it is set (if(ext5_ops.seek == NULL) ...), and call the function:

retval = (*ext5_ops.open)("/tmp/junk");

(the parentheses around *ext5_ops.open are needed because * (pointer indirection) binds less strongly than function call; the shorter ext5_ops.open("/tmp/junk") is equivalent).
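
Put together, a cut-down version of this might look like the sketch below. The three-member `fs_ops` is a shortened copy of the struct above; the toy bodies of `ext5_open` and `ext5_tell` and the helper `tell_or_zero` are invented purely so the example has something to run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int      (*open)(const char *file);
    unsigned (*tell)(int fd);
    void     (*seek)(int fd, unsigned position);
} fs_ops;

/* Toy implementations, invented for this sketch. */
static int      ext5_open(const char *file) { return (int)strlen(file); }
static unsigned ext5_tell(int fd)           { return (unsigned)fd + 100u; }

/* .seek is not named, so it is initialized to a null pointer. */
static const fs_ops ext5_ops = {
    .open = ext5_open,
    .tell = ext5_tell,
};

/* Returns the position, or 0 if the operation is not provided. */
static unsigned tell_or_zero(const fs_ops *ops, int fd)
{
    assert(ops != NULL);
    return ops->tell != NULL ? ops->tell(fd) : 0u;
}
```

Because every member keeps its real type, the compiler checks both the initializers and each call site, which is what the `void *` version gives up.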

vonbrand
  • Thank you. Do you know by any chance how I can abstract parameter conversion as well? I am receiving all parameters on the stack, and they all occupy 4 bytes. Currently I am converting them to (void *), but I don't know how to convert them back to their types without a switch statement. – bsky Mar 07 '14 at 16:59
  • Switch on what? If you get bytes on the stack, that's all you have. Besides, your types might be 4 bytes wide today, tomorrow GCC decides to use 2 or 8, or the next machine or compiler, and all hell breaks loose. See how to take advantage of the compiler's type handling. If not, rethink what you are doing. It smells awfully of OOP (classes, virtual functions, function overloading), and in that case why not use C++ (or Objective C, or another language which supports that natively)? Select the right tool for the task at hand. More to learn/redesign/rewrite; but it hurts less in the long run. – vonbrand Mar 07 '14 at 17:57

On most platforms a function pointer can be converted into a void *, but it's a little trickier to convert it back to the correct function-pointer type in order to call it. It should be possible by using a union. You'll need a separate union member of the correct type for each type of function that you want to store. And, as user4815162342 notes in a comment, you'll need to manage all the various combinations, probably with an enum.

typedef struct
{
  union {
    void *vp;
    void (*v__v)(void);
    void (*v__i)(int);
    pid_t (*pid__ccp)(const char *);
    int (*i__pid)(pid_t);
    bool (*b__ccp_u)(const char *, unsigned);
    bool (*b__ccp)(const char *);
    int (*i__ccp)(const char *);
    int (*i__i)(int);
    int (*i__i_vp_u)(int, void *, unsigned);
    int (*i__i_cvp_u)(int, const void *, unsigned);
    void (*v__i_u)(int, unsigned);
    unsigned (*u__i)(int);
  } fp;
  int arg_no;
}td_sys_call_handler;

The idea here is to try to encode the types into the identifiers, as a kind of "apps-Hungarian". This way, the meaning of any of these identifiers is directly visible.

It may be easier to generate these pointers and the associated enum at the same time. I think the easiest way to manage this part is with my favorite trick, X-Macros. Warning: it just gets more and more weird.

#define function_types(_) \
    _(v__v, void, void) \
    _(v__i, void, int) \
    _(pid__ccp, pid_t, const char *) \
    _(i__pid, int, pid_t) \
    _(b__ccp_u, bool, const char *, unsigned) \
    _(b__ccp, bool, const char *) \
    _(i__ccp, int, const char *) \
    _(i__i, int, int) \
    _(i__i_vp_u, int, int, void *, unsigned) \
    _(i__i_cvp_u, int, int, const void *, unsigned) \
    _(v__i_u, void, int, unsigned) \
    _(u__i, unsigned, int) \
    /* end function_types */

This "master" macro is a table of comma-separated rows (identifier, return type, parameter types). It takes another macro as its argument, named _ here, and applies that macro to each row in turn.

Now the struct type can be constructed by writing additional macros that consume one row each. These are passed in as _ to the table macro to instantiate the template:

#define create_function_pointer(id, ret, ...) \
    ret (*id)(__VA_ARGS__);

/* the trailing comma separates the generated enumerators */
#define create_function_type_id(id, ret, ...) \
    f__ ## id,

typedef struct {
    union {
        void *vp;
        function_types(create_function_pointer)
    } fp;
    int arg_no;
    enum {
        function_types(create_function_type_id)
    } type;
} td_sys_call_handler;
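
To make the expansion easier to follow, here is the same trick on a three-row table, as a self-contained sketch. Everything in it (`DEMO_TYPES`, `AS_MEMBER`, `AS_ENUM`, `AS_NAME`, `demo_fp`, `demo_type`, `type_name`) is a made-up name for illustration; the extra `AS_NAME` row macro is not part of the answer's design and is only there to show that one table can feed several generated constructs.

```c
#include <assert.h>
#include <string.h>

/* One row per signature: identifier, return type, parameter list. */
#define DEMO_TYPES(_) \
    _(v__v, void, void) \
    _(i__i, int, int) \
    _(u__i_u, unsigned, int, unsigned)

/* Row macros: each one turns a table row into a different construct. */
#define AS_MEMBER(id, ret, ...) ret (*id)(__VA_ARGS__);
#define AS_ENUM(id, ret, ...)   f__##id,
#define AS_NAME(id, ret, ...)   #id,

/* Expands to: void (*v__v)(void); int (*i__i)(int); ... */
typedef union { DEMO_TYPES(AS_MEMBER) } demo_fp;

/* Expands to: f__v__v, f__i__i, f__u__i_u, followed by a count. */
typedef enum { DEMO_TYPES(AS_ENUM) f__count } demo_type;

/* Expands to: "v__v", "i__i", "u__i_u", */
static const char *const demo_names[] = { DEMO_TYPES(AS_NAME) };

static const char *type_name(demo_type t)
{
    assert(t < f__count);
    return demo_names[t];
}
```

Adding a signature then means adding one row to the table; the union member, the enumerator and the name string all follow automatically and cannot drift out of step.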

Now an array of these structs can be populated:

td_sys_call_handler ish[SYSCALL_HANDLER_NUM];
int i=0;

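ish[i++]  = (td_sys_call_handler){ .fp.v__v       = halt,     .arg_no = 0, .type = f__v__v };
ish[i++]  = (td_sys_call_handler){ .fp.v__i       = exit,     .arg_no = 1, .type = f__v__i };
ish[i++]  = (td_sys_call_handler){ .fp.pid__ccp   = exec,     .arg_no = 1, .type = f__pid__ccp };
ish[i++]  = (td_sys_call_handler){ .fp.i__pid     = wait,     .arg_no = 1, .type = f__i__pid };
ish[i++]  = (td_sys_call_handler){ .fp.b__ccp_u   = create,   .arg_no = 2, .type = f__b__ccp_u };
ish[i++]  = (td_sys_call_handler){ .fp.b__ccp     = remove,   .arg_no = 1, .type = f__b__ccp };
ish[i++]  = (td_sys_call_handler){ .fp.i__ccp     = open,     .arg_no = 1, .type = f__i__ccp };
ish[i++]  = (td_sys_call_handler){ .fp.i__i       = filesize, .arg_no = 1, .type = f__i__i };
ish[i++]  = (td_sys_call_handler){ .fp.i__i_vp_u  = read,     .arg_no = 3, .type = f__i__i_vp_u };
ish[i++]  = (td_sys_call_handler){ .fp.i__i_cvp_u = write,    .arg_no = 3, .type = f__i__i_cvp_u };
ish[i++]  = (td_sys_call_handler){ .fp.v__i_u     = seek,     .arg_no = 2, .type = f__v__i_u };
ish[i++]  = (td_sys_call_handler){ .fp.u__i       = tell,     .arg_no = 1, .type = f__u__i };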

Now, calling a function given one of these structs will require (as you surmised) a switch, with a separate case for each signature. It needs to crack the arguments using stdarg and then call through the appropriate union member function pointer.

#include <stdarg.h>

void make_sys_call(td_sys_call_handler ish, ...){
    va_list ap;
    int i;
    const char *ccp;
    pid_t pid;
    bool b;
    void *vp;
    unsigned u;
    const void *cvp;
    va_start(ap, ish);
    switch(ish.type) {
    case f__v__v: ish.fp.v__v();              /* no arguments to fetch */
                  break;
    case f__v__i: i = va_arg(ap, int);        /* va_arg takes the list and the type */
                  ish.fp.v__i(i);
                  break;
    case f__pid__ccp: ccp = va_arg(ap, const char *);
                      ish.fp.pid__ccp(ccp);
                      break;
    // etc.
    }
    va_end(ap);
}

It will not be possible to return different types directly. You will either need to allocate a union type variable to hold the return value and return that, or something even more insane. An external stack data type could hold unions of the various return types. Depending on profiling results, it may be appropriate to consider this instead of returning the unions.
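
One possible shape for such a return value, sketched under my own assumptions: `sys_ret`, `ret_kind`, the `ret_*` constructors and `to_word` are hypothetical names, and flattening the result to a single unsigned word is only a guess at what a caller might want to do with it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tagged return value covering the return types in play. */
typedef enum { RET_VOID, RET_INT, RET_BOOL, RET_UNSIGNED } ret_kind;

typedef struct {
    ret_kind kind;                            /* which member of `as` is valid */
    union { int i; bool b; unsigned u; } as;
} sys_ret;

static sys_ret ret_void(void)           { sys_ret r = { .kind = RET_VOID };                return r; }
static sys_ret ret_int(int v)           { sys_ret r = { .kind = RET_INT,      .as.i = v }; return r; }
static sys_ret ret_bool(bool v)         { sys_ret r = { .kind = RET_BOOL,     .as.b = v }; return r; }
static sys_ret ret_unsigned(unsigned v) { sys_ret r = { .kind = RET_UNSIGNED, .as.u = v }; return r; }

/* Flatten a result into one unsigned word (an assumption of this sketch). */
static unsigned to_word(sys_ret r)
{
    switch (r.kind) {
    case RET_INT:      return (unsigned)r.as.i;
    case RET_BOOL:     return r.as.b ? 1u : 0u;
    case RET_UNSIGNED: return r.as.u;
    case RET_VOID:     return 0u;
    }
    assert(!"unknown ret_kind");
    return 0u;
}
```

Each case in the dispatching switch would then wrap its call, for example `return ret_int(ish.fp.i__pid(pid));`, and the function itself would be declared to return `sys_ret` instead of `void`.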

HTH.

luser droog