I have the following code:
List<HashSet<String>> authorLists = new List<HashSet<String>>();
// fill it
/** Remove duplicate authors */
private void removeDublicateAuthors(HashSet<String> newAuthors, int curLevel)
{
    for (int i = curLevel - 1; i > 0; --i)
    {
        HashSet<String> authors = authorLists[i];
        foreach (String item in newAuthors)
        {
            if (authors.Contains(item))
            {
                // Problem: modifying newAuthors while foreach is enumerating it
                // causes a "collection was modified" InvalidOperationException.
                newAuthors.Remove(item);
            }
        }
    }
}
How can I remove the items correctly? I need to iterate through both newAuthors and authorLists, which is why RemoveWhere cannot be used here.
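For illustration, here is a minimal sketch of the per-level set difference using the built-in HashSet<T>.ExceptWith, which removes from the set every element found in the given collection; the method name is made up, and I am not sure this approach is applicable in my case:

// Sketch only: per-level set difference via ExceptWith.
// Assumes authorLists is the List<HashSet<String>> declared above.
private void RemoveDuplicateAuthorsViaExcept(HashSet<String> newAuthors, int curLevel)
{
    for (int i = curLevel - 1; i > 0; --i)
    {
        // Removes from newAuthors every author already present on level i.
        newAuthors.ExceptWith(authorLists[i]);
    }
}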
It is very inefficient to create a new list, add items to it and then remove the duplicate items. In my case, the sets in authorLists have the following sizes:
authorLists[0].size = 0;
authorLists[1].size = 322;
authorLists[2].size = 75000; // (even more than this value)
I need to call removeDublicateAuthors roughly 1*(1), 322*(n) and 75000*(m) times, where n and m are the numbers of duplicate authors on the 1st and 2nd levels respectively. I have to delete these items very often and the sets are very large, so this algorithm is very inefficient. Actually, I have the following code in Java and have to rewrite it for certain reasons:
/** Remove duplicate authors in the tree of authors. */
private void removeDublicateAuthors(HashSet<String> newCoauthors, int curLevel) {
    for (int i = curLevel - 1; i > 0; --i) {
        HashSet<String> authors = coauthorLevels.get(i);
        // Iterator.remove() is the only safe way to delete elements
        // from the set while iterating over it.
        for (Iterator<String> iter = newCoauthors.iterator(); iter.hasNext();) {
            String item = iter.next();
            if (authors.contains(item)) {
                iter.remove();
            }
        }
    }
}
At the moment it works much faster than the suggested options.
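For completeness, the closest C# port of this Java method that I can see has to buffer the removals, because a HashSet<T> cannot be modified while foreach is enumerating it; this is only a sketch, assuming authorLists plays the role of coauthorLevels:

// Sketch of a deferred-removal port of the Java method above.
// Assumes authorLists corresponds to coauthorLevels.
private void removeDublicateAuthors(HashSet<String> newCoauthors, int curLevel)
{
    for (int i = curLevel - 1; i > 0; --i)
    {
        HashSet<String> authors = authorLists[i];
        // Collect the duplicates first; the set cannot be modified
        // during foreach enumeration.
        List<String> duplicates = new List<String>();
        foreach (String item in newCoauthors)
        {
            if (authors.Contains(item))
            {
                duplicates.Add(item);
            }
        }
        // Remove the buffered duplicates after enumeration has finished.
        foreach (String item in duplicates)
        {
            newCoauthors.Remove(item);
        }
    }
}

Unlike copying the whole level, this only buffers the duplicates actually found, so the extra allocation stays proportional to the number of removals.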