The software I'm working on is a data analyzer with a sliding window. I have 2 threads, one producer and one consumer, that use a circular buffer.

The consumer must process data only when the first element in the buffer is old enough, which means there are at least X elements in the buffer. After processing, however, only X/4 elements can be deleted, because of the moving window.

My solution below works quite well, except that I have a trade-off between being fast (busy-waiting in the check) and being efficient (sleeping for some time). The problem is that the right sleep time varies with load, thread scheduling and processing complexity, so a fixed sleep can hurt performance.

Is there a way to wait on a semaphore until there are at least X elements, blocking the thread otherwise, but acquiring only X/4 after the processing has been done? The tryAcquire option does not work because when the thread wakes up it consumes all the data it waited for, not just a fraction of it (X/4).

I've thought about copying the elements into a second buffer, but there are actually 7 circular buffers of big data, so I'd like to avoid duplicating the data, or even moving it.

//common structs
QSemaphore written;
QSemaphore free;
int writtenIndex = 0;
int readIndex = 0;
myCircularBuffer buf;
bool scan = true;


//producer
void produceData(data d)
{
    while ( free.tryAcquire(1, 1000) == false && scan == true)
    {
        //avoid deadlock!
        //once per second give up waiting and check if closing
    }

    if (scan == false) return;

    buf.at(writtenIndex) = d;
    writtenIndex = (writtenIndex+1) % bufferSize;
    written.release();
}


//consumer
void consumeData()
{
    while (1)
    {
        //here is the problem: usleep (slow), sched_yield (busy waiting) or what?
        //not enough data in the window yet: go back and check again
        if (buf.at(writtenIndex).age - buf.at(readIndex).age < X)
        {
            //usleep(100); ? how much time?
            //sched_yield(); ?
            //tryAcquire not an option!
            continue;
        }

        processTheData();

        //slide the window: free only a quarter of the processed elements
        written.acquire(X/4);
        readIndex = (readIndex + X/4) % bufferSize;
        free.release(X/4);
    }
}

rookie coder
  • It's odd that you're using both `scan == true` here and `!scan` in the same code. If `!scan` works as a boolean condition then there's no need for `== true`. Same goes for `== false` if you're assured that will only return a bool. – tadman Jul 18 '17 at 14:36
  • Another thing to consider is having the consumer sleep by default when it's done working and push the responsibility for waking it up to the producer. – tadman Jul 18 '17 at 14:37
  • typo, I was in a rush. It's a bool anyway. I usually like more readability. – rookie coder Jul 18 '17 at 14:38
  • Using `scan` and `!scan` is generally preferable to `scan == true` and `scan == false` as it's a lot more concise and avoids clutter. The strict comparison is usually done in other languages where the value might be a bool, or maybe something else like a string. – tadman Jul 18 '17 at 14:39
  • @tadman the producer inserts one element at a time, and has no knowledge of how much the consumer has read nor how much of the buffer it reads, so I don't see how it can wake the consumer up. Am I missing something? And my question is not really about coding style: I like to check bools explicitly ;-) – rookie coder Jul 18 '17 at 14:43
  • I'm just giving you advice here to help your code look more confident and less confused. What you like is one thing, but it also seems paranoid. Don't forget `if (scan)` is an explicit check if `scan` is of type `bool`. – tadman Jul 18 '17 at 14:54
  • As for the timer issue, if you're constructing a priority queue (based on time) where you want to sleep until the next available job then the consumer thread can sleep X seconds until the highest priority job comes due by figuring out how long that will be. If the producer injects a higher priority job it would need to send a signal to wake up the consumer prematurely so it can recalculate how long to wait. – tadman Jul 18 '17 at 14:55
  • @tadman If I read quickly, I could miss that '!', while writing == true/false is more readable. It's not about being less confident, just more relaxed. Paranoia would have been 'true == scan'. ;-) And it's not a priority queue, it's just some data to process, but there must be enough of it, and age is just a generic metric, not a time. – rookie coder Jul 18 '17 at 15:03
  • Things like `!` have important meaning, so over time you'll develop an eye for them and they'll jump out at you. In any case, if that helps you get by, that's your call. As for your scheduling problem, the code here doesn't communicate the totality of your problem with respect to scheduling. It's not clear if polling at a few hundred cycles per second and doing a quick calculation would be a good-enough solution or not. – tadman Jul 18 '17 at 15:10
  • @tadman I know the meaning of things like `!`, and I *do* have an eye for them, but I prefer readability above all. Better safe than sorry. Furthermore, if someone else reads my code, they know that the variable is a bool and not an int, for example. And the problem is quite clear in my opinion: is it possible to block waiting on a semaphore without using the data when it's available? I'll update the title. – rookie coder Jul 18 '17 at 15:37
  • Semaphores are usually pretty primitive, like they're either blocking or not. They don't have thresholds. If you need a minimum pressure to unblock then that probably requires some other method like a special lock using an atomically adjusted counter. – tadman Jul 18 '17 at 15:51
  • What do you mean, "without using the data?" Semaphores don't have data. The only property a semaphore has is its number of available _permits_. – Solomon Slow Jul 18 '17 at 16:31
  • Re, "the producer ... has no knowledge of how much the consumer has read." That's only because of your design choice: Threads communicate through shared variables, and if they're sharing one variable, then there's no real penalty for sharing more than one. There is no reason why the producer can not know everything that the consumer knows and vice versa. For example, if it is possible for the consumer to know whether it is allowed to take data when it wakes up, then it is equally possible for the producer to know the same and avoid waking the consumer before the consumer can do its work. – Solomon Slow Jul 18 '17 at 16:35
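
The two suggestions in the comments (a counter with a threshold, and letting the producer wake the consumer) could be combined roughly as follows. This is only a minimal sketch in standard C++, not Qt API: `ThresholdCounter`, `waitAtLeast`, `consume`, `stop` and `runDemo` are hypothetical names made up for illustration, and the ring buffer and the `free` semaphore from the question are left out. It also assumes the "old enough" test can be expressed as "at least X elements are available". The same idea should carry over to `QMutex` and `QWaitCondition`.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical helper: a counter on which a thread can block until at least
// `threshold` items are available, and then remove fewer items than that.
class ThresholdCounter {
public:
    // Producer side: add n items and wake the waiter so it re-checks.
    void release(std::size_t n = 1) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            count_ += n;
        }
        cv_.notify_all();
    }

    // Consumer side: block until count >= threshold or stop() was called.
    // Nothing is consumed here. Returns true if a full window is available,
    // false if stopped with fewer than `threshold` items left.
    bool waitAtLeast(std::size_t threshold) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return stopped_ || count_ >= threshold; });
        return count_ >= threshold;
    }

    // Consumer side: remove up to n items (clamped to what is available).
    void consume(std::size_t n) {
        std::lock_guard<std::mutex> lock(mutex_);
        count_ -= (n < count_) ? n : count_;
    }

    // Shutdown: wake the waiter so it can leave its loop.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t count_ = 0;
    bool stopped_ = false;
};

// Hypothetical demo: produce `total` items, process whenever X are available,
// discard X/4 after each pass. Assumes X >= 4 so the window always advances.
std::size_t runDemo(std::size_t total, std::size_t X) {
    ThresholdCounter written;
    std::size_t windows = 0;
    std::thread consumer([&] {
        while (written.waitAtLeast(X)) {
            ++windows;               // stand-in for processTheData()
            written.consume(X / 4);  // slide the window by a quarter
        }
    });
    for (std::size_t i = 0; i < total; ++i) written.release();
    written.stop();
    consumer.join();
    return windows;
}
```

With this shape the consumer sleeps inside `waitAtLeast` instead of polling, so there is no sleep interval to tune, and the amount it waits for (X) is independent of the amount it removes (X/4). Waking the consumer on every `release` is the simplest choice; the producer could instead notify only when the count crosses the threshold, at the cost of having to know X.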
