What kind of algorithms and data structures would help me do this?
I have a file of about 10,000 lines loaded in memory as an ordered set. Given a search string, I want to get all the lines in which every word of the search string is a prefix of some word in the line (ignoring case). Let me give an example to clarify:
Lines:
- "A brow Fox flies."
- "Boxes are full of food."
- "Cats runs slow"
- "Dogs hates eagles"
- "Dolphins have eyes and teath"
Case 1:
search string = "fl b a"
"A brow Fox flies."
- Explanation: the search string has three words, "fl", "b", and "a", and the only line that has a word starting with each of them is line 1 ("flies", "brow", and "A"). Line 2 has words starting with "b" and "a" but none starting with "fl".
Case 2:
search string = "e do ha"
"Dogs hates eagles", "Dolphins have eyes and teath"
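To pin down the matching rule in the two cases above, here is a minimal brute-force sketch in Python. It is only a reference for the expected results, not a fast solution. The names `words`, `matches`, and `search_naive` are made up for illustration, and it assumes that words are runs of letters or digits and that matching ignores case.

```python
import re

def words(text):
    """Lowercase the text and split it into alphanumeric words (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches(line, query):
    """True if every query word is a prefix of at least one word in the line."""
    line_words = words(line)
    return all(any(w.startswith(q) for w in line_words) for q in words(query))

def search_naive(lines, query):
    """Scan every line; cost grows with lines x line words x query words."""
    return [line for line in lines if matches(line, query)]

LINES = [
    "A brow Fox flies.",
    "Boxes are full of food.",
    "Cats runs slow",
    "Dogs hates eagles",
    "Dolphins have eyes and teath",
]

print(search_naive(LINES, "fl b a"))   # Case 1
print(search_naive(LINES, "e do ha"))  # Case 2
```

This scans every line for every query, which is what an index such as a trie is meant to avoid.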
Solution
(Fast enough for me: it took about 30 ms, including sorting the final result, on my PC for a set of 10k lines with 3 words each.)
- I used a trie, as suggested in an answer.
- I also used some other hacky methods to filter out duplicates and false positives (mainly hash sets).
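For readers who want something concrete, here is a rough Python sketch of one way a trie and hash sets can be combined for this. It is not my exact code. It assumes each trie node stores the set of line indices that contain a word passing through that node, so duplicates disappear automatically, and lines that match only some of the search words are dropped by intersecting the sets. The names `tokenize`, `TrieNode`, and `PrefixIndex` are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())

class TrieNode:
    __slots__ = ("children", "line_ids")

    def __init__(self):
        self.children = {}     # char -> TrieNode
        self.line_ids = set()  # indices of lines with a word passing through this node

class PrefixIndex:
    def __init__(self, lines):
        self.lines = list(lines)
        self.root = TrieNode()
        for line_id, line in enumerate(self.lines):
            for word in tokenize(line):
                node = self.root
                for ch in word:
                    node = node.children.setdefault(ch, TrieNode())
                    node.line_ids.add(line_id)  # a set, so repeats are ignored

    def _ids_for_prefix(self, prefix):
        """Return the indices of all lines that have a word starting with prefix."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return set()
        return node.line_ids

    def search(self, query):
        """Return lines, in original order, where every query word prefixes some word."""
        prefixes = tokenize(query)
        if not prefixes:
            return []  # arbitrary choice: an empty query matches nothing
        # Intersect starting from the smallest set to keep the work low.
        id_sets = sorted((self._ids_for_prefix(p) for p in prefixes), key=len)
        result = set(id_sets[0])  # copy, so the index itself is never modified
        for ids in id_sets[1:]:
            result &= ids
        return [self.lines[i] for i in sorted(result)]

index = PrefixIndex([
    "A brow Fox flies.",
    "Boxes are full of food.",
    "Cats runs slow",
    "Dogs hates eagles",
    "Dolphins have eyes and teath",
])
print(index.search("fl b a"))
print(index.search("e do ha"))
```

Storing a set at every node trades memory for lookup speed: each lookup walks only as many nodes as the prefix has characters, and the remaining cost is the set intersections. If memory matters, the sets could be kept only at word-end nodes and collected by walking the subtree under the prefix.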