I have a large file in which each row can be processed separately, so I launch one reader and multiple parsers.

Each parser writes its result back to a result holder array for further processing.

I found that if I launch more than one parser, the result holder array has different content on each run, no matter whether I use ConcurrentQueue, BlockingCollection, or something else.

I ran the program repeatedly and printed the result array each time; the output differs from run to run whenever I use more than one parser.

static string[] result = new string[nRow];
static BlockingCollection<queueItem> myBlk = new BlockingCollection<queueItem>();

static void Main()
{
    Reader();
}

static void parserThread()
{
    while (myBlk.IsCompleted == false)
    {
        queueItem one;

        if (myBlk.TryTake(out one) == false)
        {
            System.Threading.Thread.Sleep(tSleep);
        }
        else
        {
            oneDataRow(one.seqIndex, one.line);
        }
    }
}

static void oneDataRow(int rowIndex, string line)
{
    result[rowIndex] = // some process with line
}

static void Reader()
{
    for (int i = 0; i < 10; i++)
    {
        Task t = new Task(() => parserThread());
        t.Start();
    }

    StreamReader sr = new StreamReader(path);
    string line;
  
    int nRead=0;
    while((line = sr.ReadLine()) != null)
    {
        string innerLine = line;
        int innerN = nRead;
        myBlk.Add(new queueItem(innerN, innerLine));
        nRead++;
    }
    myBlk.CompleteAdding();
    sr.Close();

    while (myBlk.IsCompleted == false)
    {
        System.Threading.Thread.Sleep(tSleep);
    }
}

class queueItem
{
    public int seqIndex = 0;
    public string line = "";
    public queueItem(int RowOrder, string content)
    {
        seqIndex = RowOrder;
        line = content;
    }
}
Theodor Zoulias
Benny Ae
  • You could try replacing the `result[rowIndex] = // some process` with `result[rowIndex] = rowIndex.ToString();`, in order to exclude the possibility that the *"some process with line"* has anything to do with the problem. – Theodor Zoulias Nov 03 '20 at 04:32
  • When you initialize the array with `string[] result = new string[nRow];`, how do you know the value of `nRow`? – Theodor Zoulias Nov 03 '20 at 04:33

1 Answer

The way you are waiting for the process to complete is problematic:

while (myBlk.IsCompleted == false)
{
    System.Threading.Thread.Sleep(tSleep);
}

Here is the description of the IsCompleted property:

Gets whether this BlockingCollection<T> has been marked as complete for adding and is empty.

In your case the completion of the BlockingCollection should not signal the completion of the whole operation, because the last lines taken from the collection may not be processed yet.
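
The gap described above can be made visible with a small sketch. The class name `IsCompletedRaceDemo`, the method `Observe`, and the two events are hypothetical and exist only for illustration; the events force the worker to pause between taking the last item and storing its result, which is the window the question's code can fall into by chance.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class IsCompletedRaceDemo
{
    // Reports what the main thread sees while the worker has already taken
    // the last item but has not yet stored the processed result.
    public static (bool IsCompleted, string Result) Observe()
    {
        var queue = new BlockingCollection<string>();
        var result = new string[1];
        using var taken = new ManualResetEventSlim(false);
        using var release = new ManualResetEventSlim(false);

        Task worker = Task.Run(() =>
        {
            string line = queue.Take();          // the collection is now empty
            taken.Set();                         // tell the main thread the item is gone
            release.Wait();                      // stand-in for slow processing
            result[0] = line.ToUpperInvariant(); // placeholder for the real parsing
        });

        queue.Add("last line");
        queue.CompleteAdding();
        taken.Wait();

        // Marked complete for adding and empty, yet the result is still missing.
        var observed = (queue.IsCompleted, result[0]);

        release.Set();
        worker.Wait();
        return observed;
    }
}
```

In this sketch `IsCompleted` is already `true` at the moment of observation while `result[0]` is still `null`, so a loop that exits on `IsCompleted` can return before the last rows are written.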

Instead, you should store the worker tasks in an array (or list) and wait for them to complete.

Task.WaitAll(tasks);

In general, you should rarely use the IsCompleted property for anything other than logging debug information. Using it to control the execution flow introduces race conditions in most cases.
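
Putting the suggestion together, a minimal sketch might look like the following. It is not the asker's exact program: the class `ParserPipeline`, the method `Run`, the tuple item type, and the placeholder processing are assumptions made for illustration, and `GetConsumingEnumerable` is swapped in for the `TryTake`/`Sleep` polling loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class ParserPipeline
{
    // Hypothetical helper: parses `lines` with `parserCount` workers and
    // returns only after every worker has finished.
    public static string[] Run(string[] lines, int parserCount)
    {
        var result = new string[lines.Length];
        var queue = new BlockingCollection<(int Index, string Line)>();

        // Keep a reference to every worker task so they can be awaited later.
        Task[] tasks = Enumerable.Range(0, parserCount)
            .Select(_ => Task.Run(() =>
            {
                // Blocks while the queue is empty; the loop ends once the
                // collection is marked complete for adding and drained.
                foreach (var item in queue.GetConsumingEnumerable())
                {
                    // Placeholder for "some process with line".
                    result[item.Index] = item.Index + ":" + item.Line;
                }
            }))
            .ToArray();

        for (int i = 0; i < lines.Length; i++)
        {
            queue.Add((i, lines[i]));
        }
        queue.CompleteAdding();

        // Wait for the workers, not for the collection.
        Task.WaitAll(tasks);
        return result;
    }
}
```

Because each worker writes to a distinct index, the array needs no extra locking here, and `Task.WaitAll` makes the workers' writes visible before `Run` returns.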

Theodor Zoulias