I've written a very simple C library for Lua. It consists of a single function that starts a thread, and that thread does nothing but loop:
#include "lua.h"
#include "lauxlib.h"
#include <pthread.h>
#include <stdio.h>

pthread_t handle;

void* mythread(void* args)
{
    printf("In the thread !\n");
    while(1);
    pthread_exit(NULL);
}

int start_mythread()
{
    return pthread_create(&handle, NULL, mythread, NULL);
}

int start_mythread_lua(lua_State* L)
{
    lua_pushnumber(L, start_mythread());
    return 1;
}

static const luaL_Reg testlib[] = {
    {"start_mythread", start_mythread_lua},
    {NULL, NULL}
};

int luaopen_test(lua_State* L)
{
    /*
    //for lua 5.2
    luaL_newlib(L, testlib);
    lua_setglobal(L, "test");
    */
    luaL_register(L, "test", testlib);
    return 1;
}
Now, if I write a very simple Lua script that just does :
require("test")
test.start_mythread()
Running the script with lua myscript.lua will sometimes cause a segfault. Here's what GDB has to say about the core dump:
Program terminated with signal 11, Segmentation fault.
#0 0xb778b75c in ?? ()
(gdb) thread apply all bt
Thread 2 (Thread 0xb751c940 (LWP 29078)):
#0 0xb75b3715 in _int_free () at malloc.c:4087
#1 0x08058ab9 in l_alloc ()
#2 0x080513a2 in luaM_realloc_ ()
#3 0x0805047b in sweeplist ()
#4 0x080510ef in luaC_freeall ()
#5 0x080545db in close_state ()
#6 0x0804acba in main () at lua.c:389
Thread 1 (Thread 0xb74efb40 (LWP 29080)):
#0 0xb778b75c in ?? ()
#1 0xb74f6efb in start_thread () from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#2 0xb7629dfe in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129
The stack of the main thread varies a little from one crash to the next.
It seems that start_thread tries to jump to an address (0xb778b75c in this instance) that sometimes turns out to be unmapped.
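One way to check this hypothesis would be to log which memory mapping contains mythread when the thread is started, and then compare that range with the faulting address. A minimal Linux-only sketch, assuming a hypothetical helper find_mapping (not part of the library above) that reads /proc/self/maps:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, Linux-only, not part of the library above.
 * Copies the /proc/self/maps line whose address range contains `addr`
 * into `buf`. Returns 1 if such a line exists, 0 if the address is not
 * mapped, and -1 if /proc/self/maps cannot be opened. */
int find_mapping(const void *addr, char *buf, size_t len)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[4096 + 256];          /* room for a long path plus the range fields */
    unsigned long start, end;
    unsigned long target = (unsigned long)addr;
    int found = 0;

    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* Each line starts with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            target >= start && target < end) {
            line[strcspn(line, "\n")] = '\0';   /* drop the trailing newline */
            snprintf(buf, len, "%s", line);
            found = 1;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

Calling it from start_mythread with (void *)mythread and printing the result would show the address range and the file the thread function was loaded from, which can then be compared with the address reported by GDB or valgrind.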
Edit
I also have a valgrind output :
==642== Memcheck, a memory error detector
==642== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==642== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==642== Command: lua5.1 go.lua
==642==
In the thread !
In the thread !
==642== Thread 2:
==642== Jump to the invalid address stated on the next line
==642== at 0x403677C: ???
==642== by 0x46BEEFA: start_thread (pthread_create.c:309)
==642== by 0x41C1DFD: clone (clone.S:129)
==642== Address 0x403677c is not stack'd, malloc'd or (recently) free'd
==642==
==642==
==642== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==642== Access not within mapped region at address 0x403677C
==642== at 0x403677C: ???
==642== by 0x46BEEFA: start_thread (pthread_create.c:309)
==642== by 0x41C1DFD: clone (clone.S:129)
==642== If you believe this happened as a result of a stack
==642== overflow in your program's main thread (unlikely but
==642== possible), you can try to increase the size of the
==642== main thread stack using the --main-stacksize= flag.
==642== The main thread stack size used in this run was 8388608.
==642==
==642== HEAP SUMMARY:
==642== in use at exit: 1,296 bytes in 6 blocks
==642== total heap usage: 515 allocs, 509 frees, 31,750 bytes allocated
==642==
==642== LEAK SUMMARY:
==642== definitely lost: 0 bytes in 0 blocks
==642== indirectly lost: 0 bytes in 0 blocks
==642== possibly lost: 136 bytes in 1 blocks
==642== still reachable: 1,160 bytes in 5 blocks
==642== suppressed: 0 bytes in 0 blocks
==642== Rerun with --leak-check=full to see details of leaked memory
==642==
==642== For counts of detected and suppressed errors, rerun with: -v
==642== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Killed
However, I've had no crash so far when opening the Lua interpreter and entering the same instructions manually, one after the other.
Also, a C program that does the same thing, using the same lib, has never failed during my tests, as it should:
int start_mythread();

int main()
{
    int ret = start_mythread();
    return ret;
}
I've tried with both Lua 5.1 and 5.2, to no avail.
Edit: I should point out I tested this on a single-core eeePC running 32-bit Debian Wheezy (Linux 3.2).
I've just tested again on my main machine (4-core, 64-bit Arch Linux), and launching the script with lua myscript.lua segfaults every time there.
Entering the commands at the interpreter prompt works fine though, as does the C program above.
The reason I wrote this small lib in the first place is that I'm writing a bigger library, in which I first ran into this problem. After hours of fruitless debugging, including removing every shared structure and variable one by one (yes, I was that desperate), I've narrowed it down to this piece of code.
So, my guess is that I'm doing something wrong with Lua, but what could that be? I've searched for this issue as much as I could, but what I found was mostly people having problems using the Lua API from several threads (which isn't what I'm trying to do here).
If you have an idea, any help would be much appreciated.
Edit
To be more precise, I'd like to know whether I should take extra precautions with threads when writing a C lib for use within Lua scripts. Does Lua need threads created from within a dynamically loaded library to be terminated when it "unloads" the library?
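For concreteness, the kind of precaution I have in mind is a thread that can be asked to stop and is joined before the library goes away. This is only a sketch under assumptions: stop_mythread, stop_requested and running are hypothetical names that do not exist in the library above, and the thread polls a flag instead of spinning forever.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

static pthread_t handle;
static atomic_int stop_requested;  /* hypothetical flag, set to ask the thread to exit */
static int running;                /* only touched by the thread that starts/stops */

static void *mythread(void *arg)
{
    const struct timespec delay = {0, 1000000};  /* 1 ms between polls */
    (void)arg;

    /* Loop until someone asks us to stop, instead of while(1). */
    while (!atomic_load(&stop_requested))
        nanosleep(&delay, NULL);
    return NULL;
}

/* Returns 0 on success, or the error code from pthread_create. */
int start_mythread(void)
{
    int err;

    if (running)
        return 0;                  /* already started, nothing to do */
    atomic_store(&stop_requested, 0);
    err = pthread_create(&handle, NULL, mythread, NULL);
    if (err == 0)
        running = 1;
    return err;
}

/* Hypothetical counterpart: asks the thread to stop and waits for it.
 * Returns 0 on success, or the error code from pthread_join. */
int stop_mythread(void)
{
    int err;

    if (!running)
        return 0;                  /* nothing to stop */
    atomic_store(&stop_requested, 1);
    err = pthread_join(handle, NULL);
    if (err == 0)
        running = 0;
    return err;
}
```

Whether Lua actually requires something like this, and where stop_mythread should be called from (an explicit call from the script, or perhaps a __gc metamethod on a userdata owned by the library), is exactly what I'm unsure about.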