Fastest and safest way to call functions in an external process

Question

Description of the problem:

We need to call a function in an external process as fast as possible. Boost.Interprocess shared memory is used for communication. The external process is either an MPI master or a single executable. The calculation time of the function lies between 1 ms and 1 s. The function needs to be called up to 10^8-10^9 times.

I've tried many approaches, but each of them still has some problems. Here are the two best-working implementations.

Version 1 (using interprocess conditions)

Main-process

bool calculate(double& result, std::vector<double> c){
            // data_ptr_ points to a structure in shared memory
            data_ptr_->validCalculation = false;
            bool timeout = false;
            // write data (cVec_ is a vector in shared memory)

            cVec_->clear();
            for (int i = 0; i < c.size(); ++i)  
            {
                cVec_->push_back(c[i]);
            }
            // cond_input_data is a Boost.Interprocess condition
            data_ptr_->cond_input_data.notify_one();

            boost::system_time const waittime = boost::get_system_time() + boost::posix_time::seconds(maxWaitTime_in_sec);

            // lock slave process
            scoped_lock<interprocess_mutex> lock_output(data_ptr_->mutex_output);

            // wait till data calculated

            timeout = !(data_ptr_->cond_output_data.timed_wait(lock_output, waittime)); // true if timeout, false if no timeout

            if (!timeout)
            {
                // get result
                result = *result_;
                return data_ptr_->validCalculation;
            }
            else
            {
                return false;
            }
        };

The external process runs a while loop (until the abort condition is fulfilled):

do {    

    scoped_lock<interprocess_mutex> lock_input(data_ptr_->mutex_input);
    boost::system_time const waittime = boost::get_system_time() + boost::posix_time::seconds(maxWaitTime_in_sec);
    timeout = !(data_ptr_->cond_input_data.timed_wait(lock_input, waittime)); // true if timeout, false if no timeout



    if (!timeout)
    {
        if (!*abort_flag_) {

            c.clear();
            for (int i = 0; i < (*cVec_).size(); ++i)  //Insert data in the vector
            {
                c.push_back(cVec_->at(i));
            }

            // calculate value
            if (call_of_function_here(result, c)) { // valid calculation ?
                *result_ = result;
                data_ptr_->validCalculation = true;
            }
        }
    }

    // Notify the other process that the data is available, or that we did not get the input data
    data_ptr_->cond_output_data.notify_one();           


} while (!*abort_flag_); // while abort flag is not set, check if some values should be calculated

This is the best-working version, but it sometimes hangs when the calculation time is short (~1 ms). I assume this happens when the main process reaches

        data_ptr_->cond_input_data.notify_one();

before the external process is waiting in

timeout = !(data_ptr_->cond_input_data.timed_wait(lock_input, waittime));

so the notification is lost. We therefore probably have some kind of synchronisation problem. A second condition does not help (i.e. waiting only if the input data is not set, similar to the anonymous condition example with the message_in flag), since it is still possible that one process notifies the other before the other one is waiting for the notification.
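For illustration, here is a minimal sketch of the usual remedy for a lost notification: the waiter waits on a flag rather than on the notification itself, and the flag is written and read under the same mutex. In the code above, the notify is issued without holding the mutex that the waiter locks, which is what leaves the window open. The sketch uses standard-library threads as stand-ins for the two processes so that it is self-contained; `Mailbox`, `worker_loop`, `call`, and `run_round_trips` are hypothetical names, not part of the original code. `boost::interprocess::interprocess_condition` offers analogous `wait(lock, pred)` and `timed_wait(lock, abs_time, pred)` overloads, so the same shape should carry over to shared memory, though that is not tested here.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical mailbox shared by caller and worker. In the question's setup
// the equivalent members would live in the shared-memory structure.
struct Mailbox {
    std::mutex m;
    std::condition_variable cv;
    bool request_ready = false;   // guarded by m
    bool response_ready = false;  // guarded by m
    bool stop = false;            // guarded by m
    double input = 0.0;
    double output = 0.0;
};

// Worker side: waits on the flags, not on the notification itself.
void worker_loop(Mailbox& box) {
    for (;;) {
        std::unique_lock<std::mutex> lock(box.m);
        // The predicate is checked before blocking, so a request that was
        // posted before we got here is seen immediately and cannot be lost.
        box.cv.wait(lock, [&] { return box.request_ready || box.stop; });
        if (box.stop) return;
        box.request_ready = false;
        box.output = 2.0 * box.input;  // stand-in for the real calculation
        box.response_ready = true;
        lock.unlock();
        box.cv.notify_all();
    }
}

// Caller side: posts a request under the mutex, then waits for the response.
double call(Mailbox& box, double x) {
    std::unique_lock<std::mutex> lock(box.m);
    box.input = x;
    box.response_ready = false;
    box.request_ready = true;
    box.cv.notify_all();
    // wait() releases the mutex atomically while blocking.
    box.cv.wait(lock, [&] { return box.response_ready; });
    return box.output;
}

// Runs n request/response round trips and returns the sum of the results.
double run_round_trips(int n) {
    assert(n >= 0);
    Mailbox box;
    std::thread worker(worker_loop, std::ref(box));
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += call(box, 1.0);
    {
        std::lock_guard<std::mutex> lock(box.m);
        box.stop = true;
    }
    box.cv.notify_all();
    worker.join();
    return sum;
}
```

Because the flag is only changed while the mutex is held, and the waiter holds that mutex from the predicate check until `wait` atomically releases it, the notifier cannot slip in between "check" and "wait". A timeout can be added with the predicate overload of `wait_for`, which returns the predicate's value when the time runs out.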

Version 2 (using a boolean flag and a while loop with some delay)

Main-process

bool calculate(double& result, std::vector<double> c){

            data_ptr_->validCalculation = false;
            bool timeout = false;
            // write data
            cVec_->clear();
            for (int i = 0; i < c.size(); ++i)  //Insert data in the vector
            {
                cVec_->push_back(c[i]);
            }

            // this is the flag in shared memory used for communication
            *calc_flag_ = true;

            clock_t test_begin = clock();
            clock_t calc_time_begin = clock();
            do
            {
                calc_time_begin = clock();
                boost::this_thread::sleep(boost::posix_time::milliseconds(while_loop_delay_m_s));
                // wait till data calculated
                timeout = (double(calc_time_begin - test_begin) / CLOCKS_PER_SEC > maxWaitTime_in_sec);
            } while (*(calc_flag_) && !timeout);

            if (!timeout)
            {
                // get result
                result = *result_;
                return data_ptr_->validCalculation;
            }
            else
            {
                return false;
            }

        };

and the external process:

do {    

    // we wait till input data is set
    wait_begin = clock();
    do
    {           
        wait_end = clock();
        timeout = (double(wait_end - wait_begin) / CLOCKS_PER_SEC > maxWaitTime_in_sec);
        boost::this_thread::sleep(boost::posix_time::milliseconds(while_loop_delay_m_s));
    } while (!(*calc_flag_) && !(*abort_flag_) && !timeout);


    if (!timeout)
    {
        if (!*abort_flag_) {

            c.clear();
            for (int i = 0; i < (*cVec_).size(); ++i)  //Insert data in the vector
            {
                c.push_back(cVec_->at(i));
            }

            // calculate value
            if (call_of_local_function(result, c)) { // valid calculation ?
                *result_ = result;
                data_ptr_->validCalculation = true;
            }
        }
    }

    // Notify the other process that the data is available, or that we did not get the input data
    *calc_flag_ = false;


} while (!*abort_flag_); // while abort flag is not set, check if some values should be calculated

The problem with this version is the delay time. Since we have calculation times close to 1 ms, we have to set the delay to at least this value. For smaller delays the CPU load is high; for larger delays we lose a lot of performance to unnecessary waiting.
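One way to soften the trade-off between CPU load and latency is to poll with a back-off instead of a fixed delay: yield a bounded number of times first, which catches short calculations quickly, and only then start sleeping with a growing pause. The sketch below is one possible shape under stated assumptions, not a tuned implementation. `wait_for_flag`, `spin_limit`, and the pause bounds are hypothetical names and values. It also assumes the flag is a `std::atomic<bool>`, which for use in shared memory would additionally need to be lock-free on the target platform. The deadline is taken from `std::chrono::steady_clock`, because on platforms where `clock()` reports processor time the time spent sleeping is not counted, so a `clock()`-based timeout can fire much later than intended.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: wait until `flag` equals `expected` or `timeout`
// elapses. Returns true if the flag reached the expected value in time.
bool wait_for_flag(const std::atomic<bool>& flag, bool expected,
                   std::chrono::milliseconds timeout,
                   int spin_limit = 2000) {
    assert(spin_limit >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    auto pause = std::chrono::microseconds(1);
    const auto max_pause = std::chrono::microseconds(500);
    int spins = 0;

    while (flag.load(std::memory_order_acquire) != expected) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // wall-clock timeout
        }
        if (spins < spin_limit) {
            // Phase 1: stay runnable so short calculations are seen quickly.
            ++spins;
            std::this_thread::yield();
        } else {
            // Phase 2: sleep with a doubling pause, capped at max_pause,
            // to bound the CPU load during long calculations.
            std::this_thread::sleep_for(pause);
            pause = std::min(pause * 2, max_pause);
        }
    }
    return true;
}
```

The side that sets the flag would store it with `std::memory_order_release` after writing the data, so that the waiter sees the data once it sees the flag. The spin limit and pause cap are the knobs that trade latency against CPU load and would need measuring on the actual workload.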

Do you have an idea how to improve one of these versions? Or maybe there is a better solution?

thx.

Well, the obvious question is why you are using multiple processes to do this. Threads are much lighter. What do you want to achieve? How does using multiple processes help? – Dennis, Jan 26 '15 at 13:46
Unfortunately, the functions we are using are not thread-safe, so we have to take a complicated route (i.e. the main program calls an MPI process, which executes several instances). We cannot use MPI directly from the main program, since a plugin structure is used and the main program doesn't know about the execution type of the plugins. – Alexandros, Jan 26 '15 at 14:12
Well, obviously using the conditions is better. You need to mutually lock the setting of the notification, though. So you need a way for the remote process to check that the input is not already filled before waiting for the fill notification (`if( isNotFull(inputVector) ){ condition.wait(); }`). – Dennis, Jan 26 '15 at 15:09
Yeah, I've tried it out, but imagine the following scenario (master and external process run in parallel, so the order of calls is not known): 1) for all: "isNotFull" = true; 2) extern: checks the if-condition and enters the branch, but has not called "wait" yet; 3) master: isNotFull = false; 4) master: notify all; 5) extern: calls wait. Game over, we hang. – Alexandros, Jan 26 '15 at 15:35

