My application reads bigram collocations (word pairs) from a .txt file. They are to be read as key-value pairs, and a single key can have multiple values (so an ordinary Map, which holds one value per key, is ruled out). I want to keep them sorted in natural alphabetical order.
The first word of a collocation, i.e. the key, will be a verb, and its value will complete a verb-word collocation. So trees could be a consideration.
So, essentially, I am trying to implement a
SortedList<String, String>
kind of thing.
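To make the requirement concrete, here is a minimal sketch of the behaviour I have in mind, using only the JDK: a sorted map from each verb to a sorted set of words, rather than a plain one-value-per-key map. The class name `CollocationStore`, its methods, and the file format (one "verb word" pair per line, separated by whitespace) are assumptions for illustration, not a settled design.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: keys (verbs) and their values (words) both stay
// in natural alphabetical order, and one key can hold several values.
public class CollocationStore {
    private final SortedMap<String, SortedSet<String>> pairs = new TreeMap<>();

    // Adds one verb-word pair; duplicate pairs are silently ignored.
    public void add(String verb, String word) {
        pairs.computeIfAbsent(verb, k -> new TreeSet<>()).add(word);
    }

    // Assumed line format: "verb word". Lines that do not have exactly
    // two whitespace-separated fields are skipped.
    public void load(List<String> lines) {
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2) {
                add(parts[0], parts[1]);
            }
        }
    }

    // True if the exact verb-word pair is stored.
    public boolean contains(String verb, String word) {
        SortedSet<String> words = pairs.get(verb);
        return words != null && words.contains(word);
    }

    @Override
    public String toString() {
        return pairs.toString();
    }

    public static void main(String[] args) {
        CollocationStore store = new CollocationStore();
        store.load(List.of("take place", "make decision", "take care", "make progress"));
        System.out.println(store);
        // prints {make=[decision, progress], take=[care, place]}
    }
}
```

With only 100-200 entries, any of the candidate structures would be fast enough, so this sketch favours simplicity over performance.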
I have come across the following data structures that suit my requirements, although I am unable to decide which one to use (the MultiMap mentioned here is part of Google's collections framework):
Tries - I know only the basics of this data structure. I found one implementation of it in Java here. It does not implement a delete() operation.
Or is there any other data structure you would recommend? I haven't gone through the Dictionary class in Java yet. Please help me decide which one I should choose.
Thanks!
EDIT: the list is expected to contain about 100-200 entries.
EDIT2: Operations: searching whether a key-value mapping exists for a given key. As I said before, the data structure will store a list of verb-word pairings as key-value entries, and it is initialized by reading the entries from a file. It works something like this: we first get all the keys from the data structure, then read a file and tokenize it (done through OpenNLP; the data structure is not used for this), and then check whether any of the tokens matches a key (i.e. is a verb) in the data structure. Once a match is found, we get all the values for that key and search for the next token within that set of values. If the value is also found in the data structure, a collocation has been detected, and the appropriate values are then set. This is how the data structure should actually work.
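The lookup described in EDIT2 could be sketched roughly as follows. This is only an illustration under assumptions: the class `CollocationScanner` and the method `detect` are hypothetical names, the tokens are passed in as a ready-made list (in the real application they would come from OpenNLP), and "appropriate values are set" is reduced to collecting the matched pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of the scan: for each token that is a known verb,
// check whether the next token is one of that verb's stored words.
public class CollocationScanner {

    public static List<String> detect(Map<String, SortedSet<String>> pairs,
                                      List<String> tokens) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            // Step 1: is the current token a key (verb)?
            SortedSet<String> words = pairs.get(tokens.get(i));
            // Step 2: is the following token among that key's values?
            if (words != null && words.contains(tokens.get(i + 1))) {
                found.add(tokens.get(i) + " " + tokens.get(i + 1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, SortedSet<String>> pairs = new TreeMap<>();
        pairs.computeIfAbsent("take", k -> new TreeSet<>()).add("care");
        pairs.computeIfAbsent("make", k -> new TreeSet<>()).add("progress");

        // Stand-in for the OpenNLP tokenizer output.
        List<String> tokens = List.of("they", "take", "care", "and", "make", "progress");
        System.out.println(detect(pairs, tokens));
        // prints [take care, make progress]
    }
}
```

Looking up each token directly avoids fetching all the keys first; whether that fits the rest of the application is an open question.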