
I have observed blocking behavior of C NIFs when they are called concurrently by many Erlang processes. Can it be made non-blocking? Is there a mutex at work here that I'm not able to comprehend?

P.S. This can be tested with a basic "Hello world" NIF by making it sleep for a hundred microseconds when a particular PID calls it. The other PIDs calling the NIF can then be observed waiting for that sleep to finish before they execute.
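For concreteness, the kind of NIF described might look like this (a sketch only; the actual code is in the gists linked below):

#include <unistd.h>
#include "erl_nif.h"

/* Sleep for 100 microseconds, then return the atom 'hello'.
   Spawning many concurrent callers makes any blocking visible. */
static ERL_NIF_TERM hello(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
{
    usleep(100);
    return enif_make_atom(env, "hello");
}

static ErlNifFunc nif_funcs[] = {
    {"hello", 0, hello}
};

ERL_NIF_INIT(niftest, nif_funcs, NULL, NULL, NULL, NULL)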

Non-blocking behavior would be beneficial in cases where concurrency does not pose an issue (e.g. an array push or a counter increment).

I am sharing links to four gists: the spawner, conc_nif_caller, and niftest modules, plus the C source. I have tried tinkering with the value of Val, and I have indeed observed non-blocking behavior; this is confirmed by passing a large integer to the spawn_multiple_nif_callers function.

Links: spawner.erl, conc_nif_caller.erl, niftest.erl and finally niftest.c.

The line below is printed by the Erlang REPL on my Mac.

Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
abips

3 Answers


NIFs themselves don't have any mutex. You could implement one in C, and there is one used when the NIF object is loaded, but that happens only once, when the module is loaded.

One thing that might be happening (and I would bet that's what is going on) is that your C code messes up the Erlang scheduler(s). The erl_nif documentation warns:

A native function that do lengthy work before returning will degrade responsiveness of the VM, and may cause miscellaneous strange behaviors. Such strange behaviors include, but are not limited to, extreme memory usage, and bad load balancing between schedulers. Strange behaviors that might occur due to lengthy work may also vary between OTP releases.

and the documentation goes on to describe what lengthy work means and how you can solve it.

In very few words (and with quite a few simplifications):

One scheduler is created per core. Each scheduler has a list of processes it can run. If one scheduler's list is empty, it will try to steal work from another one. This can fail if there is nothing (or not enough) to steal.

An Erlang scheduler spends some amount of work in one process, then moves to another, spends some amount of work there, and moves to another. And so on, and so on. This is very similar to how an OS schedules its processes.

One thing that is very important here is how the amount of work is counted. By default, each function call is assigned some number of reductions. An addition could cost two, calling a function in your module costs one, sending a message also costs one, and some built-ins can cost more (like list_to_binary). Once a process has collected 2,000 reductions, the scheduler moves to another process.

So what is the cost of your C function? It's only one reduction.

Code like

loop() ->
   call_nif_function(),
   loop().

could run for a whole hour, but the scheduler will be stuck in this one process, because it still hasn't counted up to 2,000 reductions. Or to put it another way, it can be stuck inside the NIF with no way to move forward (at least not any time soon).

There are a few ways around this, but the general rule is that NIFs should not take a long time. So if you have long-running C code, maybe you should use drivers instead. They should be much easier to implement and manage than tinkering with NIFs.
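For illustration, here is a sketch of one of those ways: counting the NIF's own work with enif_consume_timeslice and yielding with enif_schedule_nif (available from OTP 17.3). do_some_chunk() is a hypothetical small unit of work:

#include "erl_nif.h"

void do_some_chunk(void);   /* hypothetical small unit of work */

static ERL_NIF_TERM chunked_work(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
{
    unsigned long remaining;
    if (!enif_get_ulong(env, argv[0], &remaining))
        return enif_make_badarg(env);
    while (remaining > 0) {
        do_some_chunk();
        remaining--;
        /* Tell the VM each chunk used roughly 1% of the timeslice; once
           it is spent, reschedule ourselves instead of hogging the scheduler. */
        if (enif_consume_timeslice(env, 1) && remaining > 0) {
            ERL_NIF_TERM newargv[] = { enif_make_ulong(env, remaining) };
            return enif_schedule_nif(env, "chunked_work", 0,
                                     chunked_work, 1, newargv);
        }
    }
    return enif_make_atom(env, "done");
}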

mpm
  • Just as a note, there's been talk about completely replacing C ports with NIFs. It's still a ways off (if it's even going to happen), but the general consensus within the community is that NIFs are now or are at least swiftly becoming the preferred native-execution method. – Soup d'Campbells Oct 23 '14 at 18:38
  • I disagree. First, NIFs are nice for certain things, but drivers are still very relevant especially for network subsystems and other areas that can take advantage of the Erlang emulator's file descriptor polling capabilities (sure, you can implement that yourself in a NIF, but why duplicate what's already portably provided for you?). Second, if anything replaces drivers it will be native processes, not NIFs, but native processes are not trivial and so remain a future work item. – Steve Vinoski Oct 23 '14 at 18:49
  • Native processes is what I was getting at when referring to NIFs here (e.g., that proposal I mentioned in the other comment thread, and hence why I suggest it's "a ways off"). I apologize for any confusion caused by my blending of terms. Ultimately I do see drivers being phased out in favor of native processes and by extension NIFs (NIF API is much easier to use, IMO). Of course you seem better versed than I am with regards to the VM internals, so I'll defer to you here. – Soup d'Campbells Oct 23 '14 at 23:08
  • @Soupd'Campbells yes and no. Simple NIFs are simple. Long-running ones, not so much. You can see much being done in the area of NIFs, but some of the features are experimental, and you have to introduce additional complexity. Even more, this is a "different kind" of complexity, which has to be handled (understood, managed and debugged) in a non-standard Erlang way. So while some new features are really cool, I would consider them a last resort. Drivers are and will be part of the standard library. They work, and they work well. If there are no new features, it is only because there is no need. IMHO. – mpm Oct 24 '14 at 08:40

NIF calls block the scheduler to which the process that called them is bound. So, for your example, if those other processes are on the same scheduler, they cannot call into the NIF until the first process finishes.

You cannot make an NIF call non-blocking in this regard. You can, however, spawn your own threads and offload the brunt of your work to them.

Such threads can send messages to local Erlang processes (processes on the same machine), and as such you can still get the response you desire by waiting for your spawned thread to send back a message.

A bad example:

// The struct carrying arguments to the worker thread; the original
// snippet left its definition out.
typedef struct {
    ErlNifPid caller;
} MyStruct;

static void* my_worker_function(void* arg) {
    MyStruct* args = (MyStruct*)arg;
    sleep(100);
    ErlNifEnv* msg_env = enif_alloc_env();
    ERL_NIF_TERM msg = enif_make_atom(msg_env, "ok");
    enif_send(NULL, &args->caller, msg_env, msg);
    enif_free_env(msg_env);
    delete args;
    return NULL;
}

static ERL_NIF_TERM my_function(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]) {
    MyStruct* args = new MyStruct(); // I like C++; so sue me
    enif_self(env, &args->caller);   // enif_self takes the env and a pid out-parameter
    ErlNifTid thread_id;
    // Please remember, you must at some point rejoin the thread,
    // so keep track of the thread_id
    enif_thread_create("my_function_thread", &thread_id, my_worker_function, (void*)args, NULL);
    return enif_make_atom(env, "ok");
}

And in your erlang source:

test_nif() -> 
    my_nif:my_function(),
    receive
        ok -> ok
    end.

Something to that effect, anyway.
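As for the "rejoin the thread" note in the code: a minimal sketch of joining in the module's unload callback, assuming a single worker whose id was stashed in a file-level variable (real code would track one id per spawned thread, e.g. in priv_data):

static ErlNifTid worker_tid; // stashed by my_function; hypothetical

static void unload(ErlNifEnv* env, void* priv_data) {
    void* result;
    // Wait for the worker to finish before the NIF library goes away.
    enif_thread_join(worker_tid, &result);
}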

Soup d'Campbells
  • Just as a note, I don't think it's generally a good idea to create a thread to handle every request that comes in. You'd be better off creating a handful of "worker threads" and communicating with them via some lock-protected resource (or some nifty lock-free data structure). – Soup d'Campbells Oct 23 '14 at 18:05
  • With Erlang 17, if you have long-running NIF tasks you should be offloading them to dirty schedulers (http://www.erlang.org/doc/man/erl_nif.html#lengthy_work) instead of writing your own thread pools (see the sketch after these comments). – Steve Vinoski Oct 23 '14 at 18:51
  • Yes and no. Things may have changed, but last I checked dirty schedulers were still "experimental", and the implementation was likely to change. Best not to use them for anything in a production environment until they're a finalized feature. – Soup d'Campbells Oct 23 '14 at 18:56
  • I wrote them. They are very unlikely to change at this point. – Steve Vinoski Oct 23 '14 at 19:10
  • Good to know. I had heard some interesting rumblings about making dirty schedulers run what are effectively complete C processes that I didn't see in the initial spec (such processes would be callback oriented, and have interfaces to call Erlang functions or convert into a pure-Erlang process outright). Holding out hope we might see some of those in the future. – Soup d'Campbells Oct 23 '14 at 19:45
  • If you're curious, those rumblings I heard came from this presentation: http://www.erlang-factory.com/upload/presentations/377/RickardGreen-NativeInterface.pdf Doesn't appear to be an EEP... yet. – Soup d'Campbells Oct 23 '14 at 19:56
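For reference, a minimal sketch of what dispatching to a dirty scheduler looked like with OTP 17.3's enif_schedule_nif (experimental at the time; the emulator must be built with dirty scheduler support, and rev_dirty here is a hypothetical long-running implementation):

#include "erl_nif.h"

/* Hypothetical long-running implementation, to be run on a dirty scheduler. */
static ERL_NIF_TERM rev_dirty(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]);

static ERL_NIF_TERM rev_entry(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
{
    /* Hand the real work to a dirty CPU scheduler so the normal
       schedulers are not blocked while it runs. */
    return enif_schedule_nif(env, "rev_dirty", ERL_NIF_DIRTY_JOB_CPU_BOUND,
                             rev_dirty, argc, argv);
}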

I think the responses about long-running NIFs are off the mark, since your question says you're running some simple "hello world" code and are sleeping for just 100 us. It's true that ideally a NIF call shouldn't take more than a millisecond, but your NIFs likely won't cause scheduler issues unless they run consistently for tens of milliseconds at a time or more.

I have a simple NIF called rev/1 that takes a string argument, reverses it, and returns the reversed string. I stuck a usleep call in the middle of it, then spawned 100 concurrent Erlang processes to invoke it. The two thread stacktraces shown below, based on Erlang/OTP 17.3.2, show two Erlang scheduler threads both inside the rev/1 NIF simultaneously, one at a breakpoint I set on the NIF C function itself, the other blocked on the usleep inside the NIF:

Thread 18 (process 26016):
#0  rev (env=0x1050d0a50, argc=1, argv=0x102ecc340) at nt2.c:9
#1  0x000000010020f13d in process_main () at beam/beam_emu.c:3525
#2  0x00000001000d5b2f in sched_thread_func (vesdp=0x102829040) at beam/erl_process.c:7719
#3  0x0000000100301e94 in thr_wrapper (vtwd=0x7fff5fbff068) at pthread/ethread.c:106
#4  0x00007fff8a106899 in _pthread_body ()
#5  0x00007fff8a10672a in _pthread_start ()
#6  0x00007fff8a10afc9 in thread_start ()

Thread 17 (process 26016):
#0  0x00007fff8a0fda3a in __semwait_signal ()
#1  0x00007fff8d205dc0 in nanosleep ()
#2  0x00007fff8d205cb2 in usleep ()
#3  0x000000010062ee65 in rev (env=0x104fcba50, argc=1, argv=0x102ec8280) at nt2.c:21
#4  0x000000010020f13d in process_main () at beam/beam_emu.c:3525
#5  0x00000001000d5b2f in sched_thread_func (vesdp=0x10281ed80) at beam/erl_process.c:7719
#6  0x0000000100301e94 in thr_wrapper (vtwd=0x7fff5fbff068) at pthread/ethread.c:106
#7  0x00007fff8a106899 in _pthread_body ()
#8  0x00007fff8a10672a in _pthread_start ()
#9  0x00007fff8a10afc9 in thread_start ()

If there were any mutexes within the Erlang emulator preventing concurrent NIF access, the stacktraces would not show both threads inside the C NIF.
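For reference, a rev/1 NIF of the kind described might look roughly like this (a sketch; the actual nt2.c isn't shown here):

#include <string.h>
#include <unistd.h>
#include "erl_nif.h"

static ERL_NIF_TERM rev(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
{
    char buf[1024];
    if (enif_get_string(env, argv[0], buf, sizeof(buf), ERL_NIF_LATIN1) <= 0)
        return enif_make_badarg(env);
    size_t len = strlen(buf);
    usleep(100);   /* the sleep visible in Thread 17's backtrace */
    for (size_t i = 0; i < len / 2; i++) {   /* reverse in place */
        char tmp = buf[i];
        buf[i] = buf[len - 1 - i];
        buf[len - 1 - i] = tmp;
    }
    return enif_make_string(env, buf, ERL_NIF_LATIN1);
}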

It would be nice if you were to post your code so those willing to help resolve this issue could see what you're doing and perhaps help you find any bottlenecks. It would also be helpful if you were to tell us what version(s) of Erlang/OTP you're using.

Steve Vinoski
  • My answer wasn't so much oriented around long-running NIF calls as it was about whether or not the emulator he's running was SMP-enabled and, if such, whether or not it had multiple schedulers. In a single-scheduler scenario, a NIF which called sleep would *technically* be blocking other processes, but would really just be blocking the scheduler from getting to them. As such, a user thread or dirty scheduler (as you suggested) would allow the (clean) scheduler to resume running other processes. – Soup d'Campbells Oct 23 '14 at 19:49
  • Yes, you're right. It's indeed non-blocking. I have shared the links to the code. I was playing with smaller values, which led to my erroneous conclusion earlier. – abips Oct 23 '14 at 22:30
  • @abips - Erlang has a particular load-balancing mechanism for processes described at a high level by mpm in his answer. Specifically, Erlang attempts to bind as many processes to a single scheduler with the goal of saturating that scheduler's CPU time with runnable processes (and migrating the excess). This affects spawn in that new processes will often begin on the scheduler where the spawn call is made. Therefore, with a low number of workers you're more likely to observe what appears to be blocking behavior from sleep, but it's just that all of the processes are on one (sleeping) scheduler. – Soup d'Campbells Oct 23 '14 at 23:23
  • @Soupd'Campbells Thanks for the explanation. Definitely makes sense. – abips Oct 23 '14 at 23:29