My application does some text parsing and uses a proper-noun cache to reduce database calls:

Dictionary<String, ProperNoun> ProperNounsDict;

if (!ProperNounsDict.ContainsKey(word))
{
    var newProper = new ProperNoun { Word = word };
    ProperNounsDict.Add(word, newProper);

    UnitOfWork.ProperNounRepository.Insert(newProper);
    try
    {
        UnitOfWork.SaveChangesEx();
    }
    catch (Exception ex)
    {
         // The duplicate-key error from SQL Server surfaces here.
    }
}

The problem is that the database and C# treat string equality differently, so I can run into a duplicate-key error (SQL) for similar words:

1) Database (SQL Server 2014)

Column_name  Type       Collation
Word         nvarchar   Latin1_General_100_CS_AS

Saevarsson and Sævarsson are the same from the database's perspective, which is fine for me, since words containing the character æ are very rare in the parsed texts:

select * from dict.ProperNoun where Word = N'Saevarsson'  -- returns both Saevarsson and Sævarsson

2) C#

string s1 = "Sævarsson";
string s2 = "Saevarsson";
bool equals = s1.Equals(s2, StringComparison.InvariantCulture);

s1 and s2 are considered equal when the comparison uses InvariantCulture.

Question: is there a way to check whether a string key exists using InvariantCulture semantics? I do not want to lose the O(1) complexity of the key lookup, if possible.

Things I have tried:

a) Database check - on a cache miss, also check the DB before inserting into the cache. This generates a lot of queries, so performance is awful.

b) String normalization - replace undesired characters with "normal" ones using a map similar to this one. This requires a lot of work, and I feel it could be automated, since StringComparison.InvariantCulture already knows how to deal with these characters.

Thanks.

Alexei - check Codidact

2 Answers

When you initialize your dictionary, you can use the constructor that takes an IEqualityComparer<TKey>:

Dictionary<String, ProperNoun> ProperNounsDict = 
    new Dictionary<String, ProperNoun>(StringComparer.InvariantCulture);

With this comparer, your keys are hashed and compared using invariant-culture rules. You can use other string comparers as well, depending on your needs.
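As a rough sketch of how this fits the cache check from the question, the snippet below builds the dictionary with the comparer and runs the same ContainsKey test for both spellings. The ProperNoun class here is a hypothetical stand-in with only a Word property, and the database insert is left as a comment. Whether the two spellings end up as a single entry is not guaranteed by this code: it depends on the collation data the runtime uses, so the sketch prints the count instead of assuming it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's entity; only Word is used here.
public class ProperNoun
{
    public string Word { get; set; }
}

public static class ProperNounCacheDemo
{
    public static void Main()
    {
        // The comparer supplies both Equals and GetHashCode,
        // so lookups remain O(1) on average.
        var cache = new Dictionary<string, ProperNoun>(StringComparer.InvariantCulture);

        foreach (var word in new[] { "Sævarsson", "Saevarsson" })
        {
            // ContainsKey now uses the invariant-culture comparer
            // instead of the default ordinal equality.
            if (!cache.ContainsKey(word))
            {
                cache.Add(word, new ProperNoun { Word = word });
                // The database insert from the question would go here.
            }
        }

        // Print the number of cached entries rather than assuming
        // that the two spellings were treated as the same key.
        Console.WriteLine(cache.Count);
    }
}
```

Because the comparer is fixed when the dictionary is constructed, every later ContainsKey, Add, and indexer call uses it automatically; no call site has to change.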

dotnetom
  • Yes, that is exactly what I need. In my case the dictionary is initialized from the database: `UnitOfWork.ProperNounRepository.AllNoTracking.ToDictionary(pn => pn.Word, pn => pn, StringComparer.InvariantCulture);`. Thank you. – Alexei - check Codidact Jun 19 '16 at 05:34

Use this constructor to create the dictionary:

Dictionary<String, ProperNoun> ProperNounsDict = new Dictionary<String, ProperNoun>(StringComparer.InvariantCulture);
M.kazem Akhgary