Describtion of the problem:
we need to call a function in extern process as fast as possible. Boost interprocess shared memory is used for communication. The extern process is either mpi master or a single executable. The calculation time of the function lies between 1ms and 1s. The function should be called up to 10^8-10^9 times.
I've tried a lot of possibilities, but I still have some problems with each of them. Here I introduce two of best working implementations
Version 1 ( using intreprocess conditions )
Main-process
bool calculate(double& result, std::vector<double> c){
// data_ptr is a structure in shared memoty
data_ptr_->validCalculation = false;
bool timeout = false;
// write data (cVec_ is a vector in shared memory )
cVec_->clear();
for (int i = 0; i < c.size(); ++i)
{
cVec_->push_back(c[i]);
}
// cond_input_data is boost interprocess condition
data_ptr_->cond_input_data.notify_one();
boost::system_time const waittime = boost::get_system_time() + boost::posix_time::seconds(maxWaitTime_in_sec);
// lock slave process
scoped_lock<interprocess_mutex> lock_output(data_ptr_->mutex_output);
// wait till data calculated
timeout = !(data_ptr_->cond_output_data.timed_wait(lock_output, waittime)); // true if timeout, false if no timeout
if (!timeout)
{
// get result
result = *result_;
return data_ptr_->validCalculation;
}
else
{
return false;
}
};
Extern process runs a while-loop ( till abort condition is fullfilled)
do {
scoped_lock<interprocess_mutex> lock_input(data_ptr_->mutex_input);
boost::system_time const waittime = boost::get_system_time() + boost::posix_time::seconds(maxWaitTime_in_sec);
timeout = !(data_ptr_->cond_input_data.timed_wait(lock_input, waittime)); // true if timeout, false if no timeout
if (!timeout)
{
if (!*abort_flag_) {
c.clear();
for (int i = 0; i < (*cVec_).size(); ++i) //Insert data in the vector
{
c.push_back(cVec_->at(i));
}
// calculate value
if (call_of_function_here(result, c)) { // valid calculation ?
*result_ = result;
data_ptr_->validCalculation = true;
}
}
}
//Notify the other process that the data is avalible or we dont get the input data
data_ptr_->cond_output_data.notify_one();
} while (!*abort_flag_); // while abort flag is not set, check if some values should be calculated
This is best working version, but sometimes it holds up, if the calculation time is short (~1ms). I assume, it happens, if main-process reaches
data_ptr_->cond_input_data.notify_one();
earlier, than extern process is waiting on
timeout = !(data_ptr_->cond_input_data.timed_wait(lock_input, waittime));
waiting condition. So we have probably some kind of synchronisation problem. Second condition does not help ( i.e. wait only if input data not set, similar to the anonymous condition example with message_in flag). Since, it is still possible, that one process notify the other one, before the second one is waiting for notification.
Version 2 ( using boolean flag and while loop with some delay ) Main-process
bool calculate(double& result, std::vector<double> c){
data_ptr_->validCalculation = false;
bool timeout = false;
// write data
cVec_->clear();
for (int i = 0; i < c.size(); ++i) //Insert data in the vector
{
cVec_->push_back(c[i]);
}
// this is the flag in shared memory used for communication
*calc_flag_ = true;
clock_t test_begin = clock();
clock_t calc_time_begin = clock();
do
{
calc_time_begin = clock();
boost::this_thread::sleep(boost::posix_time::milliseconds(while_loop_delay_m_s));
// wait till data calculated
timeout = (double(calc_time_begin - test_begin) / CLOCKS_PER_SEC > maxWaitTime_in_sec);
} while (*(calc_flag_) && !timeout);
if (!timeout)
{
// get result
result = *result_;
return data_ptr_->validCalculation;
}
else
{
return false;
}
};
and the extern process
do {
// we wait till input data is set
wait_begin = clock();
do
{
wait_end = clock();
timeout = (double(wait_end - wait_begin) / CLOCKS_PER_SEC > maxWaitTime_in_sec);
boost::this_thread::sleep(boost::posix_time::milliseconds(while_loop_delay_m_s));
} while (!(*calc_flag_) && !(*abort_flag_) && !timeout);
if (!timeout)
{
if (!*abort_flag_) {
c.clear();
for (int i = 0; i < (*cVec_).size(); ++i) //Insert data in the vector
{
c.push_back(cVec_->at(i));
}
// calculate value
if (call_of_local_function(result, c)) { // valid calculation ?
*result_ = result;
data_ptr_->validCalculation = true;
}
}
}
//Notify the other process that the data is avalible or we dont get the input data
*calc_flag_ = false;
} while (!*abort_flag_); // while abort flag is not set, check if some values should be calculated
The problem in this version is the delay-time. Since we have calculation times close to 1ms, we have to set the delay at least to this value. For smaller delays the cpu-load is high, for higher delays we lose a lot of performance due to not necessary waiting time
Do you have an idea how to improve one of this versions? or may be there is a better solution?
thx.