
Description of the problem:

We need to call a function in an external process as fast as possible. Boost.Interprocess shared memory is used for communication. The external process is either an MPI master or a single executable. The calculation time of the function lies between 1 ms and 1 s. The function should be called up to 10^8-10^9 times.

I've tried a lot of possibilities, but I still have some problems with each of them. Here I present the two best-working implementations.

Version 1 (using interprocess conditions)

Main-process

bool calculate(double& result, std::vector<double> c){
            // data_ptr_ is a structure in shared memory
            data_ptr_->validCalculation = false;
            bool timeout = false;
            // write data (cVec_ is a vector in shared memory )

            cVec_->clear();
            for (int i = 0; i < c.size(); ++i)  
            {
                cVec_->push_back(c[i]);
            }
            // cond_input_data is boost interprocess condition
            data_ptr_->cond_input_data.notify_one();

            boost::system_time const waittime = boost::get_system_time() + boost::posix_time::seconds(maxWaitTime_in_sec);

            // lock slave process
            scoped_lock<interprocess_mutex> lock_output(data_ptr_->mutex_output);

            // wait till data calculated

            timeout = !(data_ptr_->cond_output_data.timed_wait(lock_output, waittime)); // true if timeout, false if no timeout

            if (!timeout)
            {
                // get result
                result = *result_;
                return data_ptr_->validCalculation;
            }
            else
            {
                return false;
            }
        };

The external process runs a while-loop (until the abort condition is fulfilled)

do {    

    scoped_lock<interprocess_mutex> lock_input(data_ptr_->mutex_input);
    boost::system_time const waittime = boost::get_system_time() + boost::posix_time::seconds(maxWaitTime_in_sec);
    timeout = !(data_ptr_->cond_input_data.timed_wait(lock_input, waittime)); // true if timeout, false if no timeout



    if (!timeout)
    {
        if (!*abort_flag_) {

            c.clear();
            for (int i = 0; i < (*cVec_).size(); ++i)  //Insert data in the vector
            {
                c.push_back(cVec_->at(i));
            }

            // calculate value
            if (call_of_function_here(result, c)) { // valid calculation ?
                *result_ = result;
                data_ptr_->validCalculation = true;
            }
        }
    }

    // Notify the other process that the data is available or that we didn't get the input data
    data_ptr_->cond_output_data.notify_one();           


} while (!*abort_flag_); // while abort flag is not set, check if some values should be calculated

This is the best-working version, but it sometimes hangs when the calculation time is short (~1 ms). I assume this happens when the main process reaches

        data_ptr_->cond_input_data.notify_one();

before the external process has started waiting in

timeout = !(data_ptr_->cond_input_data.timed_wait(lock_input, waittime)); 

so the notification is lost. We probably have some kind of synchronisation problem. Adding a second condition does not help (i.e. waiting only if the input data is not yet set, similar to the message_in flag in the anonymous condition example), since it is still possible for one process to notify the other before the other has started waiting for the notification.
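The usual way to avoid such a lost notification is to protect a "ready" flag with the same mutex that the condition waits on, and to both set and test that flag only while holding the mutex. The following is a minimal sketch of that pattern, not the code from this question: it uses `std::mutex` and `std::condition_variable` between two threads as a stand-in for the Boost interprocess primitives, and the names `Channel`, `worker_loop`, `stop_worker` and the doubling "calculation" are hypothetical. As far as I know, `boost::interprocess::interprocess_condition` offers `wait` and `timed_wait` overloads that take a predicate in the same way, so the structure should carry over.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the structure kept in shared memory.
struct Channel {
    std::mutex mutex;                     // one mutex guards all fields below
    std::condition_variable cond_input;   // signalled when input_ready becomes true
    std::condition_variable cond_output;  // signalled when output_ready becomes true
    bool input_ready = false;
    bool output_ready = false;
    bool abort = false;
    double input = 0.0;
    double result = 0.0;
};

// Worker side: waits for input, computes, publishes the result.
void worker_loop(Channel& ch) {
    for (;;) {
        std::unique_lock<std::mutex> lock(ch.mutex);
        // The predicate is checked before blocking and after every wakeup,
        // so a notification sent before we got here is not lost.
        ch.cond_input.wait(lock, [&] { return ch.input_ready || ch.abort; });
        if (ch.abort) return;
        ch.input_ready = false;
        const double x = ch.input;
        lock.unlock();                    // do not hold the lock while calculating

        const double r = 2.0 * x;         // placeholder for the real calculation

        lock.lock();
        ch.result = r;
        ch.output_ready = true;
        lock.unlock();
        ch.cond_output.notify_one();
    }
}

// Caller side: returns false on timeout.
bool calculate(Channel& ch, double x, double& result,
               std::chrono::milliseconds max_wait) {
    std::unique_lock<std::mutex> lock(ch.mutex);
    ch.input = x;
    ch.output_ready = false;
    ch.input_ready = true;                // flag is set while holding the mutex
    ch.cond_input.notify_one();
    // wait_for releases the mutex while blocked and re-checks the predicate.
    if (!ch.cond_output.wait_for(lock, max_wait, [&] { return ch.output_ready; }))
        return false;
    result = ch.result;
    return true;
}

// Asks the worker to leave its loop.
void stop_worker(Channel& ch) {
    {
        std::lock_guard<std::mutex> lock(ch.mutex);
        ch.abort = true;
    }
    ch.cond_input.notify_one();
}
```

Because the worker tests `input_ready` while holding the mutex and the wait releases that mutex atomically, the caller cannot set the flag and notify in the gap between the test and the wait. A call that times out would still need something like a sequence number to discard a late result; the sketch leaves that out.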

Version 2 (using a boolean flag and a while loop with some delay)

Main-process

bool calculate(double& result, std::vector<double> c){

            data_ptr_->validCalculation = false;
            bool timeout = false;
            // write data
            cVec_->clear();
            for (int i = 0; i < c.size(); ++i)  //Insert data in the vector
            {
                cVec_->push_back(c[i]);
            }

           // this is the flag in shared memory used for communication
            *calc_flag_ = true;

            clock_t test_begin = clock();
            clock_t calc_time_begin = clock();
            do
            {
                calc_time_begin = clock();
                boost::this_thread::sleep(boost::posix_time::milliseconds(while_loop_delay_m_s));
                // wait till data calculated
                timeout = (double(calc_time_begin - test_begin) / CLOCKS_PER_SEC > maxWaitTime_in_sec);
            } while (*(calc_flag_) && !timeout);

            if (!timeout)
            {
                // get result
                result = *result_;
                return data_ptr_->validCalculation;
            }
            else
            {
                return false;
            }

        };

and the external process

do {    

    // we wait till input data is set
    wait_begin = clock();
    do
    {           
        wait_end = clock();
        timeout = (double(wait_end - wait_begin) / CLOCKS_PER_SEC > maxWaitTime_in_sec);
        boost::this_thread::sleep(boost::posix_time::milliseconds(while_loop_delay_m_s));
    } while (!(*calc_flag_) && !(*abort_flag_) && !timeout);


    if (!timeout)
    {
        if (!*abort_flag_) {

            c.clear();
            for (int i = 0; i < (*cVec_).size(); ++i)  //Insert data in the vector
            {
                c.push_back(cVec_->at(i));
            }

            // calculate value
            if (call_of_local_function(result, c)) { // valid calculation ?
                *result_ = result;
                data_ptr_->validCalculation = true;
            }
        }
    }

    // Notify the other process that the data is available or that we didn't get the input data
    *calc_flag_ = false;


} while (!*abort_flag_); // while abort flag is not set, check if some values should be calculated

The problem with this version is the delay time. Since we have calculation times close to 1 ms, we have to set the delay to at least this value. With smaller delays the CPU load is high; with larger delays we lose a lot of performance to unnecessary waiting.
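One possible compromise between CPU load and latency is to poll in two phases: yield for a bounded number of iterations (cheap when the answer arrives within about a millisecond), then fall back to sleeping with a growing pause. The sketch below is an assumption-laden illustration, not a drop-in replacement: the helper `wait_for_flag` and its parameters are hypothetical, and it assumes the flag is a `std::atomic<bool>` rather than a plain `bool`, which in shared memory is only expected to work when the atomic is lock-free. It also measures the timeout with `std::chrono::steady_clock`, because on POSIX systems `clock()` reports processor time and so barely advances while the process sleeps.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <thread>

// Waits until `flag` equals `expected` or `max_wait` has elapsed.
// Returns true if the flag reached the expected value, false on timeout.
bool wait_for_flag(const std::atomic<bool>& flag, bool expected,
                   std::chrono::microseconds max_wait,
                   int spin_iterations = 2000) {
    using steady = std::chrono::steady_clock;        // wall-clock, monotonic
    const auto deadline = steady::now() + max_wait;

    int spins = 0;
    std::chrono::microseconds pause(50);             // first sleep length
    const std::chrono::microseconds max_pause(1000); // cap for the backoff

    while (flag.load(std::memory_order_acquire) != expected) {
        if (steady::now() >= deadline) return false;
        if (spins < spin_iterations) {
            // Phase 1: stay runnable, but let other threads use the core.
            ++spins;
            std::this_thread::yield();
        } else {
            // Phase 2: sleep, doubling the pause up to max_pause.
            std::this_thread::sleep_for(pause);
            pause = std::min<std::chrono::microseconds>(pause * 2, max_pause);
        }
    }
    return true;
}
```

The writer side would store to the flag with `std::memory_order_release` after writing the data, so that a reader that sees the new flag value also sees the data. The spin count and pause lengths here are arbitrary and would need tuning against the real calculation times.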

Do you have an idea how to improve one of these versions? Or maybe there is a better solution?

thx.

Alexandros
  • Well, the obvious question is why you are using multiple processes to do this. Threads are much lighter. What do you want to achieve? How does using multiple processes help? – Dennis Jan 26 '15 at 13:46
  • Unfortunately, the functions we are using are not thread-safe, so we have to go the complicated way (i.e. call an MPI process from the main program, which executes several instances). We cannot use MPI directly from the main program, since a plugin structure is used and the main program doesn't know about the execution type of the plugins. – Alexandros Jan 26 '15 at 14:12
  • Well, obviously using the conditions is better. You need to mutually lock the setting of the notification, though. So you need a way for the remote process to check that the input is not already filled before waiting for the fill notification (`if( isNotFull(inputVector) ){ condition.wait(); }`). – Dennis Jan 26 '15 at 15:09
  • Yeah, I've tried it out, but imagine the following scenario (master and external process run in parallel, so the order of calls is not known): 1) for all: "isNotFull" = true; 2) extern: checks the if-condition and enters the branch, but has not called "wait" yet; 3) master: isNotFull = false; 4) master: notify all; 5) extern: calls wait. Game over, we hang. – Alexandros Jan 26 '15 at 15:35
