I am trying to solve an algorithm problem efficiently. The problem is described below.

I will explain the problem using characters as an example, but in reality the elements could be anything (char, int, string, double, object, etc.). I doubt that makes any difference, though.

I have a list of rows, each containing a set of characters. Say these are the six rows:

1 - A, B, C
2 - B, D
3 - E
4 - A
5 - F, B
6 - F, C

Given a user input, we want the union of all subsets of the input that exactly match one of the rows above. I will explain with examples.

Input Case 1 - A, C

Here the input is A, C, so we first look for a row that contains exactly A, C and find none. Next we look for a row containing only A and find one (row 4). Then we look for a row containing only C and find none, so the output in this case is A.

Output - A

Input Case 2 - F, B, C

Here no row contains exactly F, B, C, so we try smaller combinations: for F, B we find row 5, and for F, C we find row 6. All elements of the input are now covered, so we don't need to continue. Had any element been left uncovered, we would need to check the other combinations as well (B, C, then B, then C, then F).

Output - F, B, C

Input Case 3 - L, B, C

Here no combination of the input elements matches any row, so the output is the empty set.

Output - Empty Set

Input Case 4 - F, B, D, C

Here no row contains all the elements, so we check whether some row matches F, B, D or B, D, C or F, C, D, and so on. Continuing this way, we find that the output is F, B, D, C (there are rows for F, B and F, C and B, D, which together cover all elements).

Output - F, B, D, C
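The four cases above can be restated as "take the union of every row that is a subset of the input". Below is a minimal Python sketch of that reading, meant only as a reference for the expected outputs and not as the efficient solution being asked for. The names `ROWS` and `covered_elements` are made up for illustration.

```python
# The six example rows from the question, keyed by row number.
ROWS = {
    1: {"A", "B", "C"},
    2: {"B", "D"},
    3: {"E"},
    4: {"A"},
    5: {"F", "B"},
    6: {"F", "C"},
}


def covered_elements(query, rows=ROWS):
    """Return the union of all rows that are subsets of the query.

    Assumption: a subset of the input "exactly matches" a row when
    every element of that row is present in the input.
    """
    query = set(query)
    result = set()
    for row in rows.values():
        if row <= query:   # every element of the row is in the query
            result |= row  # the row contributes all of its elements
    return result


if __name__ == "__main__":
    for case in (["A", "C"], ["F", "B", "C"], ["L", "B", "C"], ["F", "B", "D", "C"]):
        print(case, "->", sorted(covered_elements(case)))
```

This checks every row once per query, so it avoids enumerating combinations of the input, but it still scans rows that share nothing with the input.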

I am looking for an efficient way to compute the output. I can store the data in any way (set, map, multi-index, etc.).

Blackhole
  • And how did you try to solve it? Where did you get stuck? We won't solve your problems for you without any sign of effort on your side... – Jaa-c Mar 31 '14 at 02:23
  • I tried using a set to keep all rows (rows always have unique elements in my case). Then, for a given input, I tried looking up every combination in that set until either all combinations were exhausted or the union of the matched combinations covered all elements of the input. This works but is not efficient, especially when there is a lot of data to compare. I am trying to think of other solutions as well but couldn't come up with anything better yet. – Blackhole Mar 31 '14 at 03:03
  • @Jaa-c - Considering I have explained that effort was made to find a solution, I would request that you please remove the down vote you gave. I am sorry I didn't mention it in the question itself. – Blackhole Mar 31 '14 at 03:05
  • I have partially understood, but I want to clarify to be sure. Taking your example, I would still need to consider all combinations, i.e. in this case {A, C, D} (no intersection found), {A, C} ({1} intersection found, but the rows found have a number of elements not equal to the number of elements in the query, i.e. 3), {A, D}, {C, D}, {A}, {C}, {D}, until the final answer comes out as A once all combinations are exhausted. Is that what you are proposing, or am I missing something? – Blackhole Mar 31 '14 at 05:50
  • Just to add, in my case the input could have a lot of characters, so enumerating combinations and then trying each one is something I want to get away from. – Blackhole Mar 31 '14 at 05:53
  • The solution is to map each character to the set of rows that contain it. You can then use this to create a list of rows that contain the query characters. The most straightforward and generally efficient way would be to union the row sets for each character, like "char_to_row[A] | char_to_row[C] | char_to_row[D] = {1,2,4,6}". Then for each row you do a set intersection between the set of characters in the row and the set of query characters. If the size of the row is equal to the size of the intersection, you include that row number. For the example, this is only the case for row 4. – Nuclearman Mar 31 '14 at 06:54
  • Basically, the same as the previous solution I gave, but I noticed that you didn't need to work on the query side. In the worst case, you can simply intersect the set of elements in each row with the set of elements in the query. If the result has the same size as the row, then that row is valid. The union approach can be faster than checking each row though. – Nuclearman Mar 31 '14 at 07:03
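The approach described in the last two comments could be sketched roughly as follows in Python. This is one possible reading of the comments, not a tested or tuned solution. The helper names `build_index`, `matching_rows` and `covered` are made up for illustration; `char_to_row` is the mapping named in the comment.

```python
from collections import defaultdict

# The six example rows from the question, keyed by row number.
ROWS = {
    1: {"A", "B", "C"},
    2: {"B", "D"},
    3: {"E"},
    4: {"A"},
    5: {"F", "B"},
    6: {"F", "C"},
}


def build_index(rows):
    """Map each element to the set of row numbers that contain it."""
    char_to_row = defaultdict(set)
    for row_id, elements in rows.items():
        for element in elements:
            char_to_row[element].add(row_id)
    return char_to_row


def matching_rows(query, rows, char_to_row):
    """Return the row numbers whose elements all occur in the query."""
    query = set(query)
    # Candidate rows: every row sharing at least one element with the query.
    candidates = set()
    for element in query:
        candidates |= char_to_row.get(element, set())
    # Keep a candidate only if its intersection with the query has the
    # same size as the row itself, i.e. the row is a subset of the query.
    return {r for r in candidates if len(rows[r] & query) == len(rows[r])}


def covered(query, rows, char_to_row):
    """Union of the elements of all matching rows."""
    result = set()
    for row_id in matching_rows(query, rows, char_to_row):
        result |= rows[row_id]
    return result


if __name__ == "__main__":
    index = build_index(ROWS)
    print(sorted(matching_rows({"A", "C", "D"}, ROWS, index)))
    print(sorted(covered({"F", "B", "D", "C"}, ROWS, index)))
```

The index is built once and reused for every query, so each query only touches rows that share at least one element with the input, and no combinations of the input are ever enumerated.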