
I would like to search in a set of sets in a specific way:

Example (Pseudocode):

search = {{1}, {3}}
search_base = {{1, 2}, {3, 4}}
# search results in True

This is True because 1 can be found in the first set and 3 in the second. Order doesn't matter, but the number of subsets has to be identical, and search always consists of singletons.

Example (Intuition):

search = {{"Vitamin D"}, {"Sodium"}}
search_base = {{"Vitamin D", "Vitamin"}, {"Sodium", "NA"}}

I want to know whether search (a combination of two materials, each with different hierarchical names) is in the search base. Here the search base contains only a single entry.


What I tried:

Using frozensets instead of sets so that the inner sets are hashable.

search = frozenset([frozenset([1]), frozenset([3])])
search_base = frozenset([frozenset([1, 2]), frozenset([3, 4])])

all_matched = []
for i, set_ in enumerate(search):
    for bset_ in search_base:
        if not set_.isdisjoint(bset_):
            all_matched.append(True)
print(len(all_matched) == len(search))

It feels very clunky, so my question is whether there is a much smarter (better-performing) way to solve this.

Andreas
  • What is being searched, and for what? search and search_base don't really explain it for me. – President James K. Polk Nov 14 '22 at 01:11
  • I don't get the logic either. What if `search = {{1}, {2}}` and `search_base = {{1,2},{3,4}}` are given? What if the subsets of `search` are non-singleton, e.g., `search = {{1,2,3}, {4}}` and `search_base = {{1,2}, {3,4}}`? – j1-lee Nov 14 '22 at 01:12
  • @PresidentJamesK.Polk added an example. basically I want to know if for the search set there is another set in the search_base which shares an element for each subset. – Andreas Nov 14 '22 at 01:19
  • Are the sets in `search` always singletons? – blhsing Nov 14 '22 at 01:21
  • @j1-lee added an intuitive example and another comment. For your examples, #1 would result in False, #2 would result in True – Andreas Nov 14 '22 at 01:21
  • @blhsing yes, always singletons – Andreas Nov 14 '22 at 01:22
  • So you should mention that about the #2 example from @j1-lee instead of saying it would return `True`. – blhsing Nov 14 '22 at 01:23
  • @blhsing True, added it to the question. – Andreas Nov 14 '22 at 01:25
  • @j1-lee the datatype isn't important, the frozensets are the result of me trying to solve the problem, not a specification, I thought they make the most sense. – Andreas Nov 14 '22 at 01:27
  • Can the sets in `search_base` overlap? What should be the result if `search = {{1}, {3}}` and `search_base = {{1, 2}, {1, 4}}`? With your implementation it returns `True`, but is it correct? – blhsing Nov 14 '22 at 01:29
  • @blhsing the result should be False, because, looking at my "intuitive" example, it would mean that 1=Vitamin D occurs twice while 3=Sodium is missing from the search_base. – Andreas Nov 14 '22 at 01:32
  • Exactly, which is why I brought it up because it means your current implementation does not exactly work as intended and cannot be used as a reference solution. – blhsing Nov 14 '22 at 01:34
  • @blhsing, ahhh you are right. My bad. I hadn't considered that yet, thanks for the hint! – Andreas Nov 14 '22 at 01:39
  • And what if the items in `search` do not cover every set in `search_base`, but are all *covered* by at least one set in `search_base`? For example, your implementation would return `True` for `search = {{1}, {2}}` and `search_base = {{1, 2}, {3, 4}}`, but is it correct? – blhsing Nov 14 '22 at 01:51
  • @blhsing in this case it would also be False. Intuitively speaking: the sodium would be missing. – Andreas Nov 14 '22 at 02:04
  • Aside from the needless use of `enumerate`, this doesn't seem clunky at all (other than the clunky state of affairs that arises from having a bunch of singleton sets). – juanpa.arrivillaga Nov 14 '22 at 02:35

2 Answers


Since you said the data type is not important, I will just use a list of numbers and a list of sets. You can nest `all` and `any`, using `in` to check set membership:

def search(nums, sets):
    return all(any(x in y for y in sets) for x in nums)

print(search([1, 3], [{1, 2}, {3, 4}])) # True
print(search([1, 2], [{1, 2}, {3, 4}])) # True
print(search([1, 5], [{1, 2}, {3, 4}])) # False
print(search(["Vitamin D", "Sodium"], [{"Vitamin D", "Vitamin"}, {"Sodium", "NA"}])) # True

As for performance, this does not improve upon your current code. I doubt there are any (performance-wise) better ways unless there are more restrictions on the input data.
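
The comments on the question add stricter requirements: each search item must be matched by a different set, and no set may be left over. Under those assumptions the check becomes a bipartite matching problem. Below is a minimal sketch of that idea, assuming hashable items; `matches_one_to_one` and `try_assign` are hypothetical names chosen for illustration, not anything from the question.

```python
def matches_one_to_one(items, sets):
    """Return True if every item can be paired with its own set that contains it,
    and no set is left unpaired."""
    items = list(items)
    sets = list(sets)
    if len(items) != len(sets):
        return False

    owner = {}  # index of a set -> index of the item currently paired with it

    def try_assign(i, visited):
        # Try to pair item i with some set, re-pairing earlier items if needed.
        for j, s in enumerate(sets):
            if items[i] in s and j not in visited:
                visited.add(j)
                if j not in owner or try_assign(owner[j], visited):
                    owner[j] = i
                    return True
        return False

    # Each item gets a fresh "visited" set for its own augmenting search.
    return all(try_assign(i, set()) for i in range(len(items)))


print(matches_one_to_one([1, 3], [{1, 2}, {3, 4}]))  # True
print(matches_one_to_one([1, 2], [{1, 2}, {3, 4}]))  # False
print(matches_one_to_one([1, 3], [{1, 2}, {1, 4}]))  # False
```

Unlike the nested `all`/`any` version, this returns False when two items can only be found in the same set. When the sets overlap, the re-pairing step lets an earlier item move to another set so that a later one can still be placed.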

j1-lee

For future readers, I found another solution: use the Cartesian product to build a tuple of all valid combinations (the ground truth) and check whether the search set is in it:

from itertools import product

search = {1, 3}
search_base = ({1, 2}, {3, 4})

# Every way of picking one element from each set in search_base
new_search_base = tuple(set(x) for x in product(*search_base))
print(search in new_search_base)  # True
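
If building the whole tuple up front is a concern, the same check can be done lazily. This is a sketch under the same assumptions as the snippet above; `in_search_base` is a hypothetical helper name. The worst case still visits every combination, but `any` stops at the first match and nothing is stored.

```python
from itertools import product


def in_search_base(search, search_base):
    # Pick one element from each set in search_base and compare the
    # resulting combination against the search set.
    target = set(search)
    return any(set(combo) == target for combo in product(*search_base))


print(in_search_base({1, 3}, ({1, 2}, {3, 4})))  # True
print(in_search_base({1, 2}, ({1, 2}, {3, 4})))  # False
```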
Andreas
  • The creation of `new_search_base` has a time complexity of *O(2 ^ n)*, which can get very costly quickly if/when *n* gets large. – blhsing Nov 15 '22 at 09:54
  • @blhsing that is true, however for my use case it is fine. That's also the reason I accepted the other answer by j1-lee and not this one, but I wanted to add it in case future readers have a use case similar to mine. – Andreas Nov 15 '22 at 13:30