-1

I'm looking to create a moving window that stores boolean values and outputs the most common boolean value.

I want to be able to add new values on the fly, for example:

bool[] window = { false, false, true, true, true };

New 'false' value added, array shifted:

bool[] window = { false, true, true, true, false };

Expected output would be 'true'.

What is the best/most efficient way to go about this? Should I use LINQ? Any examples would be much appreciated. Thank you.

embedded.95
  • Why do you think LINQ would be most efficient? Why do you think a bool[] is good for that? Does it need to be an adjustable window? What might its max width be? Do you have an interface for your API to that data structure, so you can benchmark different solutions against each other? Does this need to be inherently thread-safe? – Fildor Apr 14 '22 at 15:03
  • Why is it an array? If you want to add or remove things and the order is relevant, use a List. Once you have that, with or without LINQ is mostly a question of taste and not efficiency. – Ralf Apr 14 '22 at 15:11
  • Oh, I just understood. You might be clearer here; I only got it when I focused on the word "shifted". – Ralf Apr 14 '22 at 15:13
  • An array is neither best nor efficient. Use a `Queue` instead. – Hans Passant Apr 14 '22 at 15:29
  • That's a ring buffer anyway. – Etienne de Martel Apr 14 '22 at 15:49
  • A ring buffer implemented on an array is much more efficient than a `Queue`. You avoid copying all the values to "shift" the array by keeping track of a rotation index. – Ben Voigt Apr 14 '22 at 16:41
  • Will you be reading the "most common" value (the correct name for this statistic is **mode**) as often as shifting new values through, or do you read more often than you write, or do you write much more often than you read? – Ben Voigt Apr 14 '22 at 16:43

3 Answers

3

Here's a version which has O(1) insertion and read, unlike Joel's which has O(N) read.

public class BoolWindowedMode
{
    private readonly bool[] history;
    private int used;
    private int count;
    private int rotation;

    public BoolWindowedMode(int width)
    { history = new bool[width]; }

    public void Insert(bool newValue)
    {
        // remove old entry from count
        if (history[rotation]) --count;
        // count new entry
        if (newValue) ++count;
        // replace old entry in history, shifting
        history[rotation] = newValue;
        if (++rotation >= history.Length) rotation = 0;
        if (used < history.Length) ++used;
    }

    public int CountOfTrue => count;
    public int CountOfFalse => used - count;
    public bool Mode => count > used - count;
}

If you only need "correct" results once enough values are inserted to fill the window, then you can eliminate the used variable.
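
As a rough sketch of that simplification, the variant below drops `used` and treats the window as always full. The names `FilledBoolWindowedMode` and `FilledWindowDemo` are hypothetical and not part of the answer above, and the argument check is an added assumption. Until `width` values have been inserted, the untouched slots count as false.

```csharp
using System;

// Hypothetical variant of the class above with the `used` counter removed.
// Results are only meaningful once `width` values have been inserted,
// because slots that were never written are counted as false.
public class FilledBoolWindowedMode
{
    private readonly bool[] history;
    private int count;     // number of true values currently stored
    private int rotation;  // index of the oldest entry, overwritten next

    public FilledBoolWindowedMode(int width)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));
        history = new bool[width];
    }

    public void Insert(bool newValue)
    {
        if (history[rotation]) --count;   // drop the entry being overwritten
        if (newValue) ++count;            // count the new entry
        history[rotation] = newValue;
        if (++rotation >= history.Length) rotation = 0;
    }

    public int CountOfTrue => count;
    public int CountOfFalse => history.Length - count;
    public bool Mode => count > history.Length - count;
}

public static class FilledWindowDemo
{
    public static void Main()
    {
        var window = new FilledBoolWindowedMode(5);
        foreach (bool b in new[] { false, false, true, true, true })
            window.Insert(b);

        // Shifts in a new false; the window now holds
        // { false, true, true, true, false } in oldest-to-newest order.
        window.Insert(false);
        Console.WriteLine(window.Mode);
    }
}
```

The demo replays the example from the question: three of the five stored values are true, so `Mode` reports true.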

Ben Voigt
  • Very elegant solution for this problem! My own solution actually came from a more generic infrastructure I have for cyclic queues and supports other methods which are not strictly required for the problem at hand. – wohlstad Apr 14 '22 at 17:06
  • @wohlstad: Yeah, it's too bad C# `bool` doesn't have a promotion to `int` or else `Insert()` could do `count += newValue - history[rotation];` eliminating several `if` branches. – Ben Voigt Apr 14 '22 at 17:08
  • Your solution certainly deserves a vote. Since I'm out of votes for today I'll do it tomorrow. – wohlstad Apr 14 '22 at 17:20
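
The arithmetic update mentioned in the comments can be approximated with `Convert.ToInt32(bool)`, which returns 1 for true and 0 for false. This is only a sketch: the helper name `BoolCountDelta` is hypothetical, and whether the JIT actually emits branch-free code for it has not been measured here.

```csharp
using System;

// Hypothetical helper: computes how the count of true values changes
// when `oldValue` is overwritten by `newValue` in the window.
public static class BoolCountDelta
{
    // +1 when true replaces false, -1 when false replaces true, 0 otherwise.
    public static int Delta(bool newValue, bool oldValue) =>
        Convert.ToInt32(newValue) - Convert.ToInt32(oldValue);
}
```

Inside `Insert()`, the two `if` statements that adjust `count` could then be written as `count += BoolCountDelta.Delta(newValue, history[rotation]);`.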
0
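using System.Collections;          // non-generic IEnumerable / IEnumerator
using System.Collections.Generic;  // Queue<T>, IEnumerable<T>
using System.Linq;                 // Select, Sum

public class BoolWindow : IEnumerable<bool>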
{
    private static int MaxSize = 5;
    private Queue<bool> data = new Queue<bool>(MaxSize);

    public void Add(bool newValue)
    {
        if (data.Count >= MaxSize) data.Dequeue();
        data.Enqueue(newValue);
    }

    public bool MostFrequentValue()
    {
        //What do you want if the size is even and both true and false are the same?
        return data.Select(b => b?1:-1).Sum() > 0;

        // Also: we could optimize this by looping manually until one value
        // is > Count/2, but that's probably more trouble than it's worth
    }    

    public IEnumerator<bool> GetEnumerator()
    {
       return data.GetEnumerator();
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
Joel Coehoorn
  • This is quite inefficient if `MostFrequentValue()` is called frequently. – Ben Voigt Apr 14 '22 at 16:46
  • @BenVoigt Yep. We could also update a member field in the `Add()` method and simply return the field. But as asked, we're talking about 5 items, so I figure it's not worth it yet. – Joel Coehoorn Apr 14 '22 at 17:03
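
The member-field idea from the comment above might look roughly like the following. It is a sketch, not the answer's code: the class name `CountingBoolWindow`, the constructor parameter, and the argument check are assumptions made for illustration. Ties resolve to false, the same as the `Sum() > 0` test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of BoolWindow that keeps a running count of true
// values, so reading the most frequent value does not scan the queue.
public class CountingBoolWindow
{
    private readonly int maxSize;
    private readonly Queue<bool> data;
    private int trueCount;  // number of true values currently in the queue

    public CountingBoolWindow(int maxSize = 5)
    {
        if (maxSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxSize));
        this.maxSize = maxSize;
        data = new Queue<bool>(maxSize);
    }

    public void Add(bool newValue)
    {
        // Evict the oldest value once the window is full and,
        // if it was true, remove it from the running count.
        if (data.Count >= maxSize && data.Dequeue()) trueCount--;

        data.Enqueue(newValue);
        if (newValue) trueCount++;
    }

    // True only when true values strictly outnumber false values.
    public bool MostFrequentValue() => trueCount * 2 > data.Count;
}
```

`Add()` stays O(1), and `MostFrequentValue()` becomes O(1) as well because it only compares two integers.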
0

You can indeed use a Queue<bool> as already suggested, but it might cause a performance issue in some cases. After the first 5 insertions, every insertion will cost a Dequeue and an Enqueue.

Efficiency depends on the .NET implementation, which I am not an expert in, but it could be sub-optimal. Assuming the Queue is at its capacity, Dequeue can just increment the head index, but then Enqueue might require a reallocation. If Dequeue does not just increment the index, it might need to copy the whole Queue.

Since Queue is not limited to a predefined window size, its implementation might be less efficient than a dedicated cyclic queue implementation.

Therefore, if efficiency is critical, you can try to use a fixed-size array and manage a cyclic queue on top of it.

BTW - If the number of elements is even (e.g. before we have 5 entries) and we have the same number of true and false values, it is not clear what the most common value should be. I arbitrarily chose false.

The implementation:

using System;  // needed for Console in Program below

public class CyclicQueueBool
{
    public CyclicQueueBool(int maxSize)
    {
        int internalSize = maxSize + 1;
        m_Queue_ = new bool[internalSize];
        m_Head_ = 0;
        m_Tail_ = 0;
    }

    // Get the actual number of elements in the queue (smaller than the maximum size while it is not yet full)
    public int NumberOfActualElements()
    {
        if (m_Head_ >= m_Tail_)
            return (int)m_Head_ - (int)m_Tail_;
        int maxSize = m_Queue_.Length - 1;
        return (int)(maxSize - m_Tail_ + m_Head_ + 1);
    }

    // Check if the queue is empty or full:
    public bool IsEmpty() { return (m_Head_ == m_Tail_); }
    public bool IsFull() { return (_Next(m_Head_) == m_Tail_); }

    // Push a new element to the queue. If the queue is full the oldest element is discarded to keep the size.
    public void Push(bool elem)
    {
        if (IsFull())
        {
            m_Tail_ = _Next(m_Tail_);
        }
        m_Queue_[(int)(m_Head_)] = elem;
        m_Head_ = _Next(m_Head_);
    }

    // Access element by index:
    // NOTE:    Q[0]                is Tail() (i.e. the oldest)
    //          Q[NumElements-1]    is Head() (i.e. the newest)
    public bool this[int index]
    {
        get { return m_Queue_[(int)((index + m_Tail_) % m_Queue_.Length)]; }
        set { m_Queue_[(int)((index + m_Tail_) % m_Queue_.Length)] = value; }
    }

    // Get common value:
    public bool GetCommonValue()
    {
        int numTrue = 0;
        int numFalse = 0;
        int numElems = this.NumberOfActualElements();
        for (int i = 0; i < numElems; ++i)
        {
            if (this[i])
            {
                numTrue++;
            }
            else
            {
                numFalse++;
            }
        }
        return (numTrue > numFalse);
    }
    

    protected int _Next(int i) { return (i + 1) % m_Queue_.Length; }
    protected int _Prev(int i) { return (i + m_Queue_.Length - 1) % m_Queue_.Length; }

    protected bool[] m_Queue_;
    protected int m_Head_;
    protected int m_Tail_;
}



class Program
{
    static void Main(string[] args)
    {
        CyclicQueueBool q = new CyclicQueueBool(5);
        q.Push(false);
        q.Push(true);
        q.Push(false);
        q.Push(false);
        q.Push(true);
        Console.WriteLine("GetCommonValue: " + q.GetCommonValue());
    }
}

UPDATE: based on @Ben Voigt's comment I replaced List<bool> in the implementation with bool[].

wohlstad
  • This unnecessarily has `O(N)` performance on `GetCommonValue()`. – Ben Voigt Apr 14 '22 at 16:55
  • @Ben Voigt you are right. But I assumed the size of the window (N) is quite small, and optimized for frequent insertions. – wohlstad Apr 14 '22 at 16:57
  • Also `List` is getting in your way, if you want a random-access collection of fixed size that starts filled with default values, `T[]` does that. – Ben Voigt Apr 14 '22 at 16:59
  • I thought `List` is implemented internally as an array (i.e. contiguous memory), unlike `LinkedList`. And since I allocated it all in the constructor, I didn't expect any additional cost. Am I wrong? – wohlstad Apr 14 '22 at 17:01
  • You're not wrong, but you don't use anything that `List` adds on top of a plain array, and in fact you had to write extra code to work around the separation of size and capacity. The extra features are being harmful with no benefit. – Ben Voigt Apr 14 '22 at 17:03
  • I agree. I changed my answer accordingly. – wohlstad Apr 14 '22 at 17:14