I have a pretty small collection of string values in memory (around 8,400 records with an average of 10 words each).
What I'm trying to find out is whether there's a library or something similar that, when I search for a string within that collection, returns the matching values and can also assign some kind of weight to the results.
This is what I'm trying to do; let's say that I have these records in a List in memory:
- Department Store General Manager
- General and Operations Manager
- General Manager
- Restaurant Generally Managers
- Restaurant General Manager
Let's say that I'm working on a method that receives a search string and analyzes that collection in order to retrieve the results:
List<string> SearchJobTitles("General Manager")
I want something that will return all the records that contain the words General AND Manager. So far it should be easy: I could do it with regular expressions.
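For that first part, something as simple as the following sketch would probably do, even without regular expressions. `ContainsAllTerms` is a name I made up for illustration, and I'm assuming case-insensitive substring checks so that a partial match such as "Generally" still counts:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JobTitleFilter
{
    // Hypothetical helper (not from any library): keeps the titles that
    // contain every search term as a case-insensitive substring.
    public static List<string> ContainsAllTerms(IEnumerable<string> titles, string query)
    {
        string[] terms = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        return titles
            .Where(title => terms.All(term =>
                title.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();
    }
}
```

With the five records above and the search string "General Manager", this should return all five, in their original order and with no ranking at all.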
But the tricky part is that I want to apply some weighting rules, along these lines:
- The third record is the best match because it's an EXACT match.
- The first and last records should come next because they have the two words with no distance between them.
- The second record should come next because it has the two exact words, but not next to each other.
- The 4th record should be last because it has only a partial match of both words.
THAT's the kind of logic I want to apply.
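To make that concrete, here is a rough sketch of the tiered scoring I have in mind. The names `Score` and `RankTitles` and the numeric tiers (4 down to 0) are placeholders I made up; they don't come from any library, and I'm assuming words are separated by single spaces:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JobTitleRanker
{
    private static string[] Tokenize(string text) =>
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Hypothetical tiers: 4 = exact match, 3 = all terms adjacent and in order,
    // 2 = all terms present as whole words, 1 = all terms present only as
    // substrings (partial match), 0 = no match.
    public static int Score(string title, string query)
    {
        var ignoreCase = StringComparer.OrdinalIgnoreCase;
        string[] words = Tokenize(title);
        string[] terms = Tokenize(query);
        if (terms.Length == 0) return 0;

        if (words.SequenceEqual(terms, ignoreCase)) return 4;

        // Slide a window over the title looking for the terms side by side.
        for (int i = 0; i + terms.Length <= words.Length; i++)
        {
            if (words.Skip(i).Take(terms.Length).SequenceEqual(terms, ignoreCase))
                return 3;
        }

        if (terms.All(t => words.Contains(t, ignoreCase))) return 2;

        if (terms.All(t => title.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0))
            return 1;

        return 0;
    }

    // Drops non-matches and sorts by score, best first. LINQ's ordering is
    // stable, so titles in the same tier keep their original relative order.
    public static List<string> RankTitles(IEnumerable<string> titles, string query)
    {
        return titles
            .Select(t => new { Title = t, Score = Score(t, query) })
            .Where(x => x.Score > 0)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Title)
            .ToList();
    }
}
```

For the five records above and "General Manager", this would put the third record first, then the first and last, then the second, then the 4th. It is a linear scan per search, which I suspect is fine for ~8,400 records, but I haven't measured it.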
I know there are some libraries like Lucene.NET or Sphinx. I'm not discarding them; I'm just not convinced they're worth using for such a small in-memory collection.
In the worst-case scenario, I'll write an IComparer implementation for the entities, but I want to know if there's something out there that I could already use.
Thanks and regards,