I would like to make the following code thread-safe. Unfortunately, I have tried locking at various levels within this code with no success. The only instance I can seem to achieve thread-safety is to place a lock around the entire loop which effectively makes the Parallel.ForEach no faster (possibly even slower) than just using foreach. The code is relatively/almost safe with no locking. It only seems to show slight variations in the summation of the geneTokens.Value[-1] keys and gtCandidates.Value[-1] keys about once out of every 20 or so executions.
I realize that Dictionary is not thread-safe. However, I cannot change this particular object to a ConcurrentDictionary without taking a major performance hit downstream. I would rather run this part of the code with a regular foreach than change that particular object. However, I am using ConcurrentDictionary to hold the individual Dictionary objects. I have also tried making this change, and it does not solve my race issue.
Here are my Class level variables:
//Holds all tokens derived from each sequence chunk
public static ConcurrentBag<sequenceItem> tokenBag =
new ConcurrentBag<sequenceItem>();
public BlockingCollection<sequenceItem> sequenceTokens = new
BlockingCollection<sequenceItem>(tokenBag);
public ConcurrentDictionary<string, int> categories = new
ConcurrentDictionary<string, int>();
public ConcurrentDictionary<int, Dictionary<int, int>> gtStartingFrequencies = new
ConcurrentDictionary<int, Dictionary<int, int>>();
public ConcurrentDictionary<string, Dictionary<int, int>> gtCandidates = new
ConcurrentDictionary<string, Dictionary<int, int>>();
public ConcurrentDictionary<string, Dictionary<int, int>> geneTokens = new
ConcurrentDictionary<string, Dictionary<int, int>>();
Here is the Parallel.ForEach:
Parallel.ForEach(sequenceTokens.GetConsumingEnumerable(), seqToken =>
{
lock (locker)
{
//Check to see if the Sequence Token is a Gene Token
Dictionary<int, int> geneTokenFreqs;
if (geneTokens.TryGetValue(seqToken.text, out geneTokenFreqs))
{ //The Sequence Token is a Gene Token
*****************Race Issue Seems To Occur Here****************************
//Increment or create category frequencies for each category provided
int frequency;
foreach (int category in seqToken.categories)
{
if (geneTokenFreqs.TryGetValue(category, out frequency))
{ //increment the category frequency, if it already exists
frequency++;
geneTokenFreqs[category] = frequency;
}
else
{ //Create the category frequency, if it does not exist
geneTokenFreqs.Add(category, 1);
}
}
//Update the frequencies total [-1] by the total # of categories incremented.
geneTokenFreqs[-1] += seqToken.categories.Length;
******************************************************************************
}
else
{ //The Sequence Token is NOT yet a Gene Token
//Check to see if the Sequence Token is a Gene Token Candidate yet
Dictionary<int, int> candidateTokenFreqs;
if (gtCandidates.TryGetValue(seqToken.text, out candidateTokenFreqs))
{
*****************Race Issue Seems To Occur Here****************************
//Increment or create category frequencies for each category provided
int frequency;
foreach (int category in seqToken.categories)
{
if (candidateTokenFreqs.TryGetValue(category, out frequency))
{ //increment the category frequency, if it already exists
frequency++;
candidateTokenFreqs[category] = frequency;
}
else
{ //Create the category frequency, if it does not exist
candidateTokenFreqs.Add(category, 1);
}
}
//Update the frequencies total [-1] by the total # of categories incremented.
candidateTokenFreqs[-1] += seqToken.categories.Length;
*****************************************************************************
//Only update the candidate sequence count once per sequence
if (candidateTokenFreqs[-3] != seqToken.sequenceId)
{
candidateTokenFreqs[-3] = seqToken.sequenceId;
candidateTokenFreqs[-2]++;
//Promote the Token Candidate to a Gene Token, if it has been found >=
//the user defined candidateThreshold
if (candidateTokenFreqs[-2] >= candidateThreshold)
{
Dictionary<int, int> deletedCandidate;
gtCandidates.TryRemove(seqToken.text, out deletedCandidate);
geneTokens.TryAdd(seqToken.text, candidateTokenFreqs);
}
}
}
else
{
//create a new token candidate frequencies dictionary by making
//a copy of the default dictionary from
gtCandidates.TryAdd(seqToken.text, new
Dictionary<int, int>(gtStartingFrequencies[seqToken.sequenceId]));
}
}
}
});