Available sets are

A={"one","two","three"}
B={"two","three","four"}
C={"four","five"}

Given set is

D = {"four","five","six"}

The task is to find which available set has the most elements in common with the given set.

Here
C contains 2 elements of D
B contains 1 element of D.
This can be computed by finding the intersection of D with each of A, B and C.
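
For reference, the pairwise approach could be sketched roughly as below. This is only an illustration in Python; the names `available`, `overlap` and `best` are made up for the example. It touches every available set once per query, which is what becomes expensive with millions of sets.

```python
# Brute force: intersect D with every available set and keep the largest overlap.
# All variable names here are illustrative assumptions.
available = {
    "A": {"one", "two", "three"},
    "B": {"two", "three", "four"},
    "C": {"four", "five"},
}
D = {"four", "five", "six"}

# Size of the intersection of D with each available set.
overlap = {name: len(elements & D) for name, elements in available.items()}

# Name of the set sharing the most elements with D.
best = max(overlap, key=overlap.get)

print(overlap)  # {'A': 0, 'B': 1, 'C': 2}
print(best)     # C
```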

How can I find the closest set when there are millions of available sets?

vasanths294

1 Answer

Build a data structure in which each element is a key and its value is the collection of set names that contain that element. For your example, the data structure would look like this:

"one": {A}
"two": {A,B}
"three": {A,B}
"four": {B,C}
"five": {C}
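
One possible way to build such a structure in Python is sketched below. The function name `build_index` and the `available` dictionary are assumptions made for this example, not part of any library.

```python
from collections import defaultdict

# The available sets from the question, keyed by set name.
available = {
    "A": {"one", "two", "three"},
    "B": {"two", "three", "four"},
    "C": {"four", "five"},
}

def build_index(sets_by_name):
    """Map each element to the names of the sets that contain it."""
    index = defaultdict(set)
    for name, elements in sets_by_name.items():
        for element in elements:
            index[element].add(name)
    return index

index = build_index(available)
print(sorted(index["four"]))  # ['B', 'C']
```

The structure is built once up front; after that, each lookup only touches the sets that actually share an element with the input.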

Now take each element of your input set D and, for every set name stored under that element, increment that set's counter. In your example, D is {"four","five","six"}.

Loop through "four", "five" and "six":

Step 1: The counter will be all zeros initially  

Step 2: After looking at the values for "four" the counter will look like below  
B:1, C:1  

Step 3: After looking at the values for "five" the counter will look like below  
B:1, C:2  

Step 4: "six" is not a key in the data structure, so the counter stays unchanged  
B:1, C:2  

Step 5: Choose the set with the maximum value. In this case it will be C.  
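
The steps above can be sketched with a plain dictionary as the counter. This is a minimal illustration, assuming the element-to-set-names structure has already been built; the names `index`, `counts` and `best` are chosen just for the example.

```python
# Element -> names of the sets containing it (same as the structure shown earlier).
index = {
    "one": {"A"},
    "two": {"A", "B"},
    "three": {"A", "B"},
    "four": {"B", "C"},
    "five": {"C"},
}
D = {"four", "five", "six"}

# Step 1: the counter starts empty, i.e. every set implicitly has count zero.
counts = {}

# Steps 2-4: for each element of D, add one to every set that contains it.
# Elements missing from the index (such as "six") contribute nothing.
for element in D:
    for name in index.get(element, ()):
        counts[name] = counts.get(name, 0) + 1

# Step 5: choose the set with the maximum count (None if nothing overlaps).
best = max(counts, key=counts.get) if counts else None

print(sorted(counts.items()))  # [('B', 1), ('C', 2)]
print(best)                    # C
```

Set A never appears in the counter because none of its elements are in D.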

If you are using Python, you can use the most_common method of collections.Counter.
https://docs.python.org/3/library/collections.html#collections.Counter
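
A rough sketch of how that might look is below. The helper name `closest_set` and the `index` dictionary are assumptions for this example; only `Counter`, `Counter.update` and `Counter.most_common` come from the standard library.

```python
from collections import Counter

# Element -> names of the sets containing it (assumed to be built beforehand).
index = {
    "one": {"A"},
    "two": {"A", "B"},
    "three": {"A", "B"},
    "four": {"B", "C"},
    "five": {"C"},
}

def closest_set(index, query):
    """Return (set_name, shared_count) for the best match, or None if no overlap."""
    counts = Counter()
    for element in query:
        # update() with an iterable adds one to the count of each name it yields.
        counts.update(index.get(element, ()))
    top = counts.most_common(1)
    return top[0] if top else None

print(closest_set(index, {"four", "five", "six"}))  # ('C', 2)
print(closest_set(index, {"seven"}))                # None
```

If several sets tie for the highest count, `most_common(1)` returns only one of them, so use `most_common()` and compare counts if you need all of the tied sets.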

Kannappan Sirchabesan