
I'm wondering what the best implementation would be for a game that continuously checks whether a word entered by the user is in a dictionary. The dictionary itself has ~220,000 words.

I'm looking for a solution that uses as few resources as possible and, if possible, a way to reduce the size of the dictionary (right now it's a .txt file of about 1.2 MB).

My current solution is to have 26 separate files (one per starting letter, a-z) and load each into an array (which will hold at most about 15,000 strings).

A second idea would be to use a trie rather than an array.

A database seems like it would take up far too much space.

Edit: I also need to be able to check whether the dictionary contains any other words that begin with the letters entered so far.

Ex. 1: the word is "sea". Yes, it is in the dictionary, and yes, there are other words that begin with "sea".

Ex. 2: the word is "pov". No, it is not in the dictionary, but yes, there are other words that begin with "pov".
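
To make the two checks concrete, here is a minimal sketch of the trie idea in plain Java. The class and method names (`WordTrie`, `insert`, `contains`, `hasLongerWords`) are hypothetical and chosen only for illustration, and the per-node `HashMap` is the simplest layout rather than the most memory-efficient one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; not part of any library.
class WordTrie {
    // One node per character position; isWord marks the end of a dictionary word.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<Character, Node>();
        boolean isWord;
    }

    private final Node root = new Node();

    // Adds a word, creating nodes along its path as needed.
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            Node next = node.children.get(c);
            if (next == null) {
                next = new Node();
                node.children.put(c, next);
            }
            node = next;
        }
        node.isWord = true;
    }

    // Follows the characters of s from the root; returns null if the path does not exist.
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node;
    }

    // True if the exact word was inserted.
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    // True if at least one inserted word is strictly longer than prefix and starts with it.
    // Every node is created by insert(), so any child implies a longer word below it.
    boolean hasLongerWords(String prefix) {
        Node node = find(prefix);
        return node != null && !node.children.isEmpty();
    }
}
```

With "sea", "seal" and "poverty" inserted, `contains("sea")` and `hasLongerWords("sea")` both hold, while `contains("pov")` is false and `hasLongerWords("pov")` is true, matching the two examples above. Both checks cost one step per letter regardless of dictionary size; the trade-off is one map object per node, which can add up over ~220,000 words.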

Matt Stokes
  • Try using a HashMap. http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.html. I believe a trie takes massive amounts of memory. – Voicu Jul 13 '13 at 18:20
  • I updated my requirements a little. A trie would take more memory than a HashMap, but I also need to be able to search for other existing words. I don't believe you can use a HashMap with a regex, can you? – Matt Stokes Jul 13 '13 at 18:26
  • Some useful info: http://stackoverflow.com/questions/879807/java-search-in-hashmap-keys-based-on-regex, although these solutions use iteration, which defeats the purpose of a hash map. – Voicu Jul 13 '13 at 18:29
  • You can implement your own `Map` that supports regexes wrapping a `HashMap`. – m0skit0 Jul 13 '13 at 18:35
  • lol yes it does. So is the answer for the best implementation then the original text file containing 220,000+ words (alphabetical and lower-case), subdivided into my 26 separate files (which are loaded into my Android project resources), each of which I load into an array and then iterate over? – Matt Stokes Jul 13 '13 at 18:36
  • DB should be fine. Have you tried with DB and found that performance is bad? What DB design are you using? SQLite DB is not going to take much more space than the files. – m0skit0 Jul 13 '13 at 18:38
  • You should use a sorted ArrayList and use binary search to find the word, or the first word after it in order. You can then walk down the list until you reach the first word that no longer starts with the search string. Alternatively, an SQLite database may use more mass storage but less main memory (which is normally more precious). – Michael Butscher Jul 13 '13 at 18:40
  • I agree with m0skit0 - using an SQLite DB would be easiest: http://developer.android.com/guide/topics/data/data-storage.html#db . What are your requirements for keeping the entire dictionary in memory? Why not keep it on storage and let the dictionary libs do the heavy lifting, instead of re-implementing what you need? Implement first and then benchmark and see how it works for you. Don't fall into the trap of premature optimization. – Roshan Jul 13 '13 at 19:00
  • @Roshan the implementation is such that I will be performing a large number of searches on the dictionary for words beginning with a given letter. So the thought was to extract the entries beginning with that letter from the dictionary to speed up access, as performing this many queries would surely slow performance. – Matt Stokes Jul 13 '13 at 19:09
  • My vote is for DB too, especially given the power of the LIKE operator. – Wand Maker Jul 13 '13 at 19:14
  • You can also index your target column in the DB, which should speed up lookups. – Desert Jul 13 '13 at 21:30
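
The sorted-list suggestion from the comments can be sketched roughly as follows, using a plain sorted `String[]` and `java.util.Arrays.binarySearch`. The class and method names (`SortedDictionary`, `contains`, `hasLongerWords`) are hypothetical, and the sketch assumes the word list contains no duplicates.

```java
import java.util.Arrays;

// Hypothetical class for illustration; assumes the word list has no duplicates.
class SortedDictionary {
    private final String[] words;

    SortedDictionary(String[] input) {
        // Sort a copy so the order matches String.compareTo, which binarySearch relies on.
        words = input.clone();
        Arrays.sort(words);
    }

    // True if the exact word is in the list.
    boolean contains(String word) {
        return Arrays.binarySearch(words, word) >= 0;
    }

    // True if some word strictly longer than prefix starts with it.
    // All words sharing a prefix sit in one contiguous run directly after the
    // position where the prefix itself is (or would be), so one neighbour check suffices.
    boolean hasLongerWords(String prefix) {
        int idx = Arrays.binarySearch(words, prefix);
        // Found: look at the next entry. Not found: decode the insertion point.
        int next = idx >= 0 ? idx + 1 : -(idx + 1);
        return next < words.length && words[next].startsWith(prefix);
    }
}
```

Each check is a single binary search (about 18 comparisons for ~220,000 words), and the only memory used is the array of strings itself, so splitting the list into 26 per-letter files is not needed for lookup speed under this approach.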
