Context I have a script that generates objects of the same class (called State, for reference). Each object holds a value, and the purpose of the objects is to store and move values around. The other attributes of each object are determined by a nested dictionary, such that the dictionary:

dictionary = {
    "foo":[1,2],
    "bar":{
        "baz":[3],
        "qux":[4]
        }
}

would produce 4 objects with the following attributes:

self.foo = 1, self.bar.baz = 3, self.value = 0
self.foo = 1, self.bar.qux = 4, self.value = 0
self.foo = 2, self.bar.baz = 3, self.value = 0
self.foo = 2, self.bar.qux = 4, self.value = 0

(more information on this process here: Combination of nested dictionaries with arbitrary lengths in python)

Problem I need a way to find all the objects that have a certain attribute, or set of attributes. For example, all the objects with the attribute self.bar.baz may need to take on a certain value, while those with self.bar.qux need to take on some other value.

One potential solution One solution is to create a separate object (of the Environment class, for reference) that is responsible for keeping a list of all instantiated objects.

The simplest solution is to append all the objects to a list and then loop through that list, checking which objects have a given set of attributes. However, this gets inefficient very quickly, especially since the set of objects is typically much more complex than the example above.

A slightly more advanced solution is for the Environment object to have all possible attributes, e.g.

self.foo
self.bar.baz
self.bar.qux

The values of these attributes are lists containing all objects that have the attribute. This way, there is no need to scan the full list of objects, and objects with a given set of attributes can be found by taking the intersection of two or more of these lists.
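A minimal sketch of this indexing idea follows. The `State` class here is only a stand-in (the real one is not shown), and `attr_paths`, `register`, and `find` are hypothetical names chosen for illustration. Sets are used instead of lists so that combining several attributes is a plain intersection, and attribute names are discovered at registration time rather than hardcoded.

```python
from collections import defaultdict


class State:
    """Stand-in for the real State class (assumption): attributes are set dynamically."""

    def __init__(self, **attrs):
        for name, value in attrs.items():
            setattr(self, name, value)


def attr_paths(obj, prefix=""):
    """Yield every dotted attribute path on obj, e.g. 'foo', 'bar', 'bar.baz'."""
    for name, value in vars(obj).items():
        path = prefix + name
        yield path
        if isinstance(value, State):  # descend into nested attribute holders
            yield from attr_paths(value, path + ".")


class Environment:
    """Hypothetical registry mapping each attribute path to the objects that have it."""

    def __init__(self):
        self.index = defaultdict(set)

    def register(self, state):
        # Index the object under every attribute path it currently has.
        for path in attr_paths(state):
            self.index[path].add(state)

    def find(self, *paths):
        """Return the set of objects that have all of the given attribute paths."""
        groups = [self.index.get(path, set()) for path in paths]
        return set.intersection(*groups) if groups else set()


env = Environment()
s1 = State(foo=1, bar=State(baz=3), value=0)
s2 = State(foo=1, bar=State(qux=4), value=0)
s3 = State(foo=2, bar=State(baz=3), value=0)
s4 = State(foo=2, bar=State(qux=4), value=0)
for s in (s1, s2, s3, s4):
    env.register(s)

with_baz = env.find("bar.baz")                 # s1 and s3
with_foo_and_qux = env.find("foo", "bar.qux")  # s2 and s4
```

Each lookup touches only the relevant index entries instead of scanning every object. The trade-off is that the index reflects attributes as they were at registration time, so an object whose attributes change afterwards would need to be registered again.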

However, I'm not sure:

  • What the code for this would actually look like?
  • If there is a more efficient solution?

Additional constraints Importantly, the attributes that State objects have are not fixed. That is, they may not be foo and bar (or baz or qux), but completely different keys. Therefore, I cannot hardcode the attributes to search for.

rorance_

1 Answer

Here is a proposal:

# creating data to use ...

class State:
    """Dummy class for State, the `name` member and the `__repr__` are for visualization purposes"""
    def __init__(self, name):
        self.name = name

    def __repr__(self) -> str:
        return self.name


# creating fake data (according to the structure you provided, though)
s1, s2, s3, s4 = State("S1"), State("S2"), State("S3"), State("S4")
s1.foo = 1; s1.bar = State(""); s1.bar.baz = 3; s1.value = 0
s2.foo = 1; s2.bar = State(""); s2.bar.qux = 4; s2.value = 0
s3.foo = 2; s3.bar = State(""); s3.bar.baz = 3; s3.value = 0
s4.foo = 2; s4.bar = State(""); s4.bar.qux = 4; s4.value = 0

states = [s1, s2, s3, s4]


# now solving the problem ...

def recursively_has_attr(obj, attrs: str) -> bool:
    """Walk through an object's members to check whether it has an 'a.b.c' attribute."""
    for attr_name in attrs.split("."):
        if hasattr(obj, attr_name):
            obj = getattr(obj, attr_name)
        else:
            return False
    else:
        return True


# we are going to create customized versions of the function `recursively_has_attr`
# by pre-filling its `attrs` parameters, using `functools.partial`
from functools import partial
matching_foo = partial(recursively_has_attr, attrs="foo")
matching_bar_baz = partial(recursively_has_attr, attrs="bar.baz")
matching_bar_qux = partial(recursively_has_attr, attrs="bar.qux")
matching_bar_zod = partial(recursively_has_attr, attrs="bar.zod")

print(f"matching foo : {tuple(filter(matching_foo, states))}")
print(f"matching bar.baz : {tuple(filter(matching_bar_baz, states))}")
print(f"matching bar.qux : {tuple(filter(matching_bar_qux, states))}")
print(f"matching bar.zod : {tuple(filter(matching_bar_zod, states))}")

user_supplied_attrs = input("Enter the attribute you want to search : ")
filtering_function = partial(recursively_has_attr, attrs=user_supplied_attrs)
print(f"matching {user_supplied_attrs!r} : {tuple(filter(filtering_function, states))}")

produces:

matching foo : (S1, S2, S3, S4)
matching bar.baz : (S1, S3)
matching bar.qux : (S2, S4)
matching bar.zod : ()
Enter the attribute you want to search : bar
matching 'bar' : (S1, S2, S3, S4)

This works; other approaches can too.

As for being efficient, is it really a requirement?
If yes, what do you mean by efficient? Is it memory-wise, speed-wise, line-of-code-wise, complexity-wise?
If you want very good performance, would you consider using a lower-level language? Cython for having "true" arrays?

Test my approach on your data, and if it is not sufficient, please post a question with a clear goal :)

Lenormju
  • Apologies for not clearly specifying the goal, and I appreciate the response. – rorance_ Jan 15 '22 at 08:22
  • Apologies for not clearly specifying the goal, and I appreciate the response. This solution is very close to what is needed. However, I would also need to match the values of attributes, e.g. `partial(recursively_has_attr, attrs="bar.baz.3")`. In the real example, the values of these attributes are strings, e.g. self.disease = "breast cancer" or self.disease = "diabetes". So the function would need to be able to find all objects that have disease = "diabetes" or disease = "breast cancer", and not just all objects with the attribute "disease". – rorance_ Jan 15 '22 at 08:32
  • Do you control the input format? It would be simpler and cleaner to have something like `bar.baz=3`, because it makes clear which parts are attributes and what the expected value is. With `bar.baz.3`, on the other hand, you have to wonder whether `3` is the name of an attribute that must exist or the value of the `baz` attribute. Or could both be possible? – Lenormju Jan 16 '22 at 10:18
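The `bar.baz=3` format suggested in the last comment could be handled along these lines. This is only a sketch under stated assumptions: the helper names `get_path` and `matches` are hypothetical, the `State` class is a stand-in mirroring the dummy class from the answer, and values are compared by their string form because the query arrives as text.

```python
_MISSING = object()  # sentinel, so that attributes holding None still count as present


class State:
    """Stand-in mirroring the dummy class from the answer (assumption)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self) -> str:
        return self.name


def get_path(obj, path: str):
    """Follow a dotted path such as 'bar.baz'; return _MISSING if any step is absent."""
    for attr_name in path.split("."):
        obj = getattr(obj, attr_name, _MISSING)
        if obj is _MISSING:
            return _MISSING
    return obj


def matches(obj, query: str) -> bool:
    """Accept 'path' (attribute must exist) or 'path=value' (attribute must equal value)."""
    path, sep, expected = query.partition("=")
    found = get_path(obj, path.strip())
    if found is _MISSING:
        return False
    if not sep:
        return True  # no '=' in the query: existence is enough
    # Assumption: compare string forms, so the text '3' matches the integer 3.
    return str(found) == expected.strip()


s1, s2 = State("S1"), State("S2")
s1.disease = "diabetes"
s2.disease = "breast cancer"
s1.bar = State("")
s1.bar.baz = 3
states = [s1, s2]

diabetic = [s for s in states if matches(s, "disease=diabetes")]
with_baz_3 = [s for s in states if matches(s, "bar.baz=3")]
```

Splitting on the first `=` keeps attribute names and the expected value unambiguous, which is the point raised in the comment. String comparison is a simplification: if `3` and `"3"` must be told apart, the expected value would need to be parsed into its intended type first.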