
I'm trying to return a struct from a shared library written in C. Here is a simple test library that returns a structure and a plain int32, libstruct.c, compiled with gcc -shared -Wl,-soname,libstruct.so.1 -o libstruct.so.1 libstruct.c:

#include <stdint.h>

int32_t newint(int32_t arg) {
    return arg;
}

struct MyStruct {
    int32_t member;
};
struct MyStruct newstruct(int32_t arg) {
    struct MyStruct myStruct;
    myStruct.member = arg;
    return(myStruct);
}

I can use this library from a simple C program, usestruct.c, compiled with gcc -o usestruct usestruct.c ./libstruct.so.1:

#include <stdio.h>
#include <stdint.h>

struct MyStruct {
    int32_t member;
};
extern struct MyStruct newstruct(int32_t);
extern int32_t newint(int32_t);

int main() {
    printf("%d\n", newint(42));
    struct MyStruct myStruct;
    myStruct = newstruct(42);
    printf("%d\n", myStruct.member);
    return 0;
}

I can run it with LD_LIBRARY_PATH=./ ./usestruct, and it works correctly, printing both values. Now let's write the analogous program in Raku, usestruct.raku:

#!/bin/env raku
use NativeCall;

sub newint(int32) returns int32 is native('./libstruct.so.1') { * }
say newint(42);

class MyStruct is repr('CStruct') {
    has int32 $.member;
}
sub newstruct(int32) returns MyStruct is native('./libstruct.so.1') { * }
say newstruct(42).member;

This prints the first 42, but then terminates with a segmentation fault.

The C version works, but I'm no expert in C, so maybe I forgot something, such as a compile option? Or is this a bug in Rakudo?

fingolfin

2 Answers


The NativeCall interface requires that C structs be exchanged through pointers. As the NativeCall documentation puts it:

CStruct objects are passed to native functions by reference and native functions must also return CStruct objects by reference.

Your C function, however, returns a new struct by value. My guess is that NativeCall, expecting a pointer, then interprets the returned value as a memory address and tries to read from or write to arbitrary memory, hence the segfault.

You can change your function to return a pointer instead:

struct MyStruct* newstruct(int32_t val) {
    /* dynamically allocating now */
    struct MyStruct *stru = malloc(sizeof *stru);
    if (stru == NULL) {
        return NULL;  /* allocation failed */
    }
    stru->member = val;
    return stru;
}

with #include <stdlib.h> at the very top for malloc. The Raku program is essentially the same, modulo some aesthetics:

# prog.raku
use NativeCall;

my constant LIB = "./libstruct.so";

class MyStruct is repr("CStruct") {
    has int32 $.member;
}

# C bridge
sub newint(int32) returns int32 is native(LIB) { * }
sub newstruct(int32) returns MyStruct is native(LIB) { * }

say newint(42);

my $s := newstruct(84);
say $s;
say $s.member;

We build the lib & run the Raku program to get

$ gcc -Wall -Wextra -pedantic -shared -o libstruct.so -fPIC mod_struct.c
$ raku prog.raku
42
MyStruct.new(member => 84)
84

(I took the liberty of renaming the C file to "mod_struct.c".)
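If the original library cannot be changed, the same idea can probably be applied from the outside with a thin wrapper. The following is only a sketch under that assumption: newstruct_ptr is a hypothetical name that does not appear in the original code, and newstruct is repeated here just so the file compiles on its own.

```c
#include <stdint.h>
#include <stdlib.h>

struct MyStruct {
    int32_t member;
};

/* Stand-in for the original by-value function from the question;
   in a real build it would come from the existing library. */
struct MyStruct newstruct(int32_t arg) {
    struct MyStruct myStruct;
    myStruct.member = arg;
    return myStruct;
}

/* Hypothetical wrapper: copies the by-value result onto the heap
   and returns a pointer to it, which is the shape NativeCall expects. */
struct MyStruct *newstruct_ptr(int32_t arg) {
    struct MyStruct *out = malloc(sizeof *out);
    if (out == NULL) {
        return NULL;          /* allocation failed */
    }
    *out = newstruct(arg);    /* plain struct assignment copies all members */
    return out;
}
```

On the Raku side, the wrapper would then be declared in place of the by-value function, with MyStruct as its return type.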

This seems good, but there is an issue: now that memory has been dynamically allocated, we are responsible for giving it back, and we have to do that ourselves with a freeing function bridged from C. As the documentation notes:

When a CStruct-based type is used as the return type of a native function, the memory is not managed for you by the GC.

So

/* addendum to mod_struct.c */
void free_struct(struct MyStruct* s) {
    free(s);
}

Note that since the struct has no dynamically allocated members (it only holds an integer), no further freeing is needed.
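For contrast, here is a sketch of what the "further freeing" would look like if a struct did own heap memory. The Series type and both functions are hypothetical and not part of the original library; the point is only the order of the free calls.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical struct that owns a heap-allocated array */
struct Series {
    int32_t count;
    int32_t *values;   /* allocated separately from the struct itself */
};

/* Allocates the struct and its array; fills the array with 0..count-1 */
struct Series *new_series(int32_t count) {
    if (count <= 0) {
        return NULL;
    }
    struct Series *s = malloc(sizeof *s);
    if (s == NULL) {
        return NULL;
    }
    s->values = malloc((size_t)count * sizeof *s->values);
    if (s->values == NULL) {
        free(s);               /* do not leak the outer struct */
        return NULL;
    }
    s->count = count;
    for (int32_t i = 0; i < count; i++) {
        s->values[i] = i;
    }
    return s;
}

/* Frees the member first, then the struct that pointed to it */
void free_series(struct Series *s) {
    if (s == NULL) {
        return;
    }
    free(s->values);
    free(s);
}
```

Freeing only the outer struct here would leak the array, which is why the single free call is enough for MyStruct but would not be for a type like this.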

Now the Raku program needs to be aware of this, and use it:

# prog.raku
use NativeCall;

my constant LIB = "./libstruct.so";

class MyStruct is repr("CStruct") {
    has int32 $.member;
}

# C bridge
sub newint(int32) returns int32 is native(LIB) { * }
sub newstruct(int32) returns MyStruct is native(LIB) { * }
sub free_struct(MyStruct) is native(LIB) { * };   # <-- new!

say newint(42);

my $s := newstruct(84);
say $s;
say $s.member;

# ... after some time
free_struct($s);
say "successfully freed struct";

and the output follows as

42
MyStruct.new(member => 84)
84
successfully freed struct

Manually keeping track of MyStruct objects to remember to free them later can be cumbersome; that would be writing C! At the Raku level we already have a class representing the struct, so we can add a DESTROY submethod that frees the object whenever the garbage collector deems it necessary:

class MyStruct is repr("CStruct") {
    has int32 $.member;

    submethod DESTROY {
        free_struct(self);
    }
}

With this addition, no manual calls to free_struct are needed (in fact, they are best avoided, because they could lead to a double free, which is undefined behaviour at the C level).


P.S. Your main C file could be revised, e.g., a header file seems in order, but that is out of scope here, and perhaps it was only a demonstrative example anyway. In either case, thanks for providing an MRE, and welcome to the website.

Mustafa Aydın
  • 1
    "NativeCall interface requires ...". I barely know C and definitely not enough to intuitively understand why any given NativeCall requirement is the way it is, in particular whether NativeCall may one day be improved to reduce or eliminate any given current restriction. Do you know enough to comment on whether this requirement might one day be eliminated (and if so, do you know of any references to where that has been discussed?) If you do, please consider adding this to your answer, or at least commenting. Your answer seems to me to already be outstanding, but that just makes me want more! – raiph Nov 06 '22 at 16:02
  • 2
    hi @raiph, thanks for your kind words. I don't exactly know why it is the way it is. I can speculate though :) It's better practice to pass around structs by reference in C especially when they are large, as pass-by-value copies the entire thing and not only speed is lost but also function's stack may overflow. With a pointer both of these are no longer issues. However, it makes the struct `is rw` on C level :) [contd] – Mustafa Aydın Nov 06 '22 at 16:21
  • 1
    [contd] (this may or may not be desired; if desired, one more advantage to pointers!). Given that you're probably also the writer of the C part and/or it's already easy to go wrong in C, I think pointer choice is cool. But again, the developers of the NativeCall interface and those with more C knowledge would know better than me. – Mustafa Aydın Nov 06 '22 at 16:22
  • 1
    @MustafaAydın, I noticed that you are using binding: `my $s := newstruct(84);`. Does this avoid unnecessary copying of the returned structure? – fingolfin Nov 06 '22 at 19:22
  • 2
    @fingolfin Sorry, I should have clarified that! The answer is: not really. With `:=` instead of `=` there, I'm avoiding the wrapping of a Scalar container around the RHS value (the MyStruct instance in this case). It (`:=`) makes `$s` directly "look" at the RHS value, so to speak, and prevents re-assignments to `$s` (e.g., `$s = -7` would fail after that point; so kind of imposed immutability, a testament). You can read about Scalars and containers [here](https://docs.raku.org/type/Scalar) and [here](https://docs.raku.org/language/containers). (Also: your answer is superb, thanks for that!) – Mustafa Aydın Nov 06 '22 at 20:03

In addition to @Mustafa's great answer:

I found another way to solve my problem: we can allocate the structure in Raku and pass it to the C function. Here is an example, file mod_struct.c:

#include <stdint.h>

struct MyStruct {
    int32_t member;
};
void writestruct(struct MyStruct *outputStruct, int32_t arg) {
    outputStruct->member = arg;
}

File usestruct.raku:

#!/bin/env raku
use NativeCall;

class MyStruct is repr('CStruct') {
    has int32 $.member;
}
sub writestruct(MyStruct is rw, int32) is native('./libstruct.so') { * }

my $myStruct = MyStruct.new;
writestruct($myStruct, 42);
say $myStruct.member;

Compile and run it:

$ gcc -Wall -Wextra -pedantic -shared -o libstruct.so -fPIC mod_struct.c
$ ./usestruct.raku
42
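For comparison, this is roughly the same caller-allocates pattern written in plain C. It is only a sketch: read_back is a hypothetical helper added for illustration, and writestruct is repeated so the snippet compiles on its own.

```c
#include <stdint.h>

struct MyStruct {
    int32_t member;
};

/* Same function as in mod_struct.c, repeated to keep the sketch self-contained */
void writestruct(struct MyStruct *outputStruct, int32_t arg) {
    outputStruct->member = arg;
}

/* Hypothetical helper: the caller owns the storage (here, on the stack),
   so the library never allocates and no free function is needed. */
int32_t read_back(int32_t arg) {
    struct MyStruct local;
    writestruct(&local, arg);
    return local.member;
}
```

Because the library only writes into memory it was handed, ownership stays with the caller, which in the Raku version is the object created by MyStruct.new.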
fingolfin
  • 2
    Again, thanks for sharing this. I would like to point out a caveat: this approach works fine for structs with non-pointer members (e.g., ints and floats); however, if the struct has a string (char *) or an array, the Raku-made values are passed fine, but they cannot be safely used at the C level without cloning, because the reference can be destroyed at the Raku level (maybe even immediately!), and then the C level would be looking at a forbidden area of memory. For those, dynamic allocation at the C level is needed. – Mustafa Aydın Nov 07 '22 at 18:08
  • 2
    For example, if you run the example [here](https://docs.raku.org/language/nativecall#Notes_on_memory_management) without `strdup` but with everything else the same, I do get "foo is str and 123" sometimes... but! I also get "foo is &sayM and 123", i.e., a strange memory read, and sometimes even an error, "Malformed termination of UTF-8 string", implying that the passed string is sometimes already gone (freed) and we are facing undefined behaviour. – Mustafa Aydın Nov 07 '22 at 18:09
  • 1
    Thank you again for your explanations, @MustafaAydın. I read this page of documentation many times, but thanks to you I began to understand it! – fingolfin Nov 08 '22 at 17:33
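To make the caveat from the comments above concrete, here is a sketch of a C function that copies a string member instead of keeping the caller's pointer. Everything in it (the Tagged struct, clone_string, remember_tag, recall_tag) is hypothetical and written for illustration; clone_string is a portable stand-in for what strdup does.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct whose string member is supplied by the caller */
struct Tagged {
    int32_t member;
    char *tag;
};

/* Portable stand-in for strdup: returns a heap copy, or NULL on failure */
static char *clone_string(const char *src) {
    if (src == NULL) {
        return NULL;
    }
    size_t len = strlen(src) + 1;   /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, src, len);
    }
    return copy;
}

/* C-side storage that must outlive the call */
static char *saved_tag = NULL;

/* Stores a private copy of t->tag; returns 0 on success, -1 on failure.
   Keeping t->tag itself would be unsafe if the caller frees it later. */
int remember_tag(const struct Tagged *t) {
    char *copy = clone_string(t->tag);
    if (copy == NULL) {
        return -1;
    }
    free(saved_tag);    /* drop any previously stored copy */
    saved_tag = copy;
    return 0;
}

/* Returns the stored copy, or NULL if nothing was stored yet */
const char *recall_tag(void) {
    return saved_tag;
}
```

The stored copy stays valid even after the caller's buffer changes or disappears, which is the situation described for strings created on the Raku side.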