While working on a difference engine to identify differences in very large data structures, I noticed that a type comparison between identical-but-redeclared namedtuples misbehaves. Redeclaring the namedtuples is unavoidable*. Here is a minimal example:

def test_named_tuples_same_type():
    from collections import namedtuple

    X = namedtuple("X", "x")
    a = X(1)

    # We are unable to avoid redeclaring X
    X = namedtuple("X", "x")
    b = X(1)

    print(repr(a))
    print(repr(b))
    # X(x=1)
    # X(x=1)

    assert isinstance(type(a), type(b))  # fail
    assert type(a) == type(b)  # fail

The asserts return:

>       assert isinstance(type(a), type(b))  # fail
E       AssertionError: assert False
E        +  where False = isinstance(<class 'tests.test_deep_diff.X'>, <class 'tests.test_deep_diff.X'>)
E        +    where <class 'tests.test_deep_diff.X'> = type(X(x=1))
E        +    and   <class 'tests.test_deep_diff.X'> = type(X(x=1))

and

>       assert type(a) == type(b)  # fail
E       AssertionError: assert <class 'tests.test_deep_diff.X'> == <class 'tests.test_deep_diff.X'>

How can I assert that the types of both objects are equal, or at least semantically equal (without resorting to str(type()))?

*Redeclaring the namedtuple is unavoidable because it takes place in unmodifiable exec'd code to generate the data structures being diffed.

Drakes
  • Thinking to myself: I wonder how to display just how the types are different? – Drakes Sep 26 '21 at 23:17
  • "Redeclaring the namedtuples is unavoidable*." Highly skeptical of that claim. In any case, *stop using the type to compare* because *they are of different types*. The simplest solution would be *not to make them have different types*, but since you (claim) that is unavoidable, you are going to have to choose some way. First, you should define what you mean by "semantically equal". Then implement that logic in code. – juanpa.arrivillaga Sep 27 '21 at 00:42
  • And, there is no misbehavior here. This is all quite expected from the basic semantics of Python's type system. – juanpa.arrivillaga Sep 27 '21 at 00:42
  • It is unavoidable. It's not laziness, it is unavoidable whether it is believed or not (and you missed the * at the bottom of the question, btw). _Semantically equal_ should be obvious from the phrase, but to be explicit, both named tuples have the same module, are named `X`, and take a parameter `x` - semantically equally: a superset of Pythonically-equal it appears. Given this information, do you have a helpful solution you'd like to share? – Drakes Sep 27 '21 at 00:48
  • Well, for example, the fact that they are supposed to have the same module... what does that mean? That the classes were defined in the same module? So `X = namedtuple("X", "x")` in module `foo` should not be the same "semantic type" as `X = namedtuple("X", "x")` defined in module `bar`? – juanpa.arrivillaga Sep 27 '21 at 00:57
  • In any case, you probably want to rely on the `.__name__` of the type, and its `._fields`. If the module matters, then `.__module__` as well. – juanpa.arrivillaga Sep 27 '21 at 01:04
  • I'll reprint part of the question for clarity: `, `. There is no confusion about where the `namedtuple`s are declared. I invite anyone to put a breakpoint at the first `assert` and explain within the context of the debugger how the two objects differ except for memory address - that would make a beneficial answer for future readers too, IMHO. – Drakes Sep 27 '21 at 02:04

1 Answer

It isn't entirely clear precisely what you mean by semantically equivalent. But consider:

>>> from collections import namedtuple
>>> X1 = namedtuple("X", "x")
>>> X2 = namedtuple("X", "x")

Then you can use something like:

>>> def equivalent_namedtuple_types(t1, t2):
...     return (t1.__name__, t1._fields) == (t2.__name__, t2._fields)
...
>>> equivalent_namedtuple_types(X1, X2)
True
>>>

From your comments, it seems like you may care about the .__module__ attribute as well.
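If the module does matter, the same check could be extended roughly as follows. This is only a sketch: the `check_module` flag is a hypothetical parameter added for illustration, not something from the question. It also shows that the two classes are distinct objects, which is why `type(a) == type(b)` fails (classes compare by identity by default).

```python
from collections import namedtuple


def equivalent_namedtuple_types(t1, t2, check_module=True):
    """Return True if two namedtuple classes share a name and field list.

    check_module is a hypothetical flag for illustration; when it is set,
    the defining module (__module__) must match as well.
    """
    key1 = (t1.__name__, t1._fields)
    key2 = (t2.__name__, t2._fields)
    if check_module:
        key1 += (t1.__module__,)
        key2 += (t2.__module__,)
    return key1 == key2


X1 = namedtuple("X", "x")
X2 = namedtuple("X", "x")
Y = namedtuple("X", "y")  # same name, different field

print(X1 is X2)  # False: each namedtuple() call builds a new class
print(equivalent_namedtuple_types(X1, X2))  # True
print(equivalent_namedtuple_types(X1, Y))   # False
```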

juanpa.arrivillaga
  • I'm involved in a difference engine. It essentially compares two objects, say `a` and `b`. The rational thing to do before checking that fields are the same is to check the **types** are the same first. So, given such a comparator that works wonderfully 99% of the time, the redeclared (in the same module) `namedtuple` case appears and is a problem. Given my OP, I'd like the two objects to be interpreted as equal because they have the same "class" and the same member fields. – Drakes Sep 27 '21 at 02:10
  • @Drakes ok, and you can use the above to check if they have the same "class" before you check the tuples for equality. – juanpa.arrivillaga Sep 27 '21 at 04:28
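For a comparator like the one described in these comments, the type check and the value check might be combined along these lines. This is a rough sketch only: `namedtuples_equal` is a hypothetical helper name, and falling back to strict type identity for objects that are not namedtuples is an assumption about how the difference engine should behave.

```python
from collections import namedtuple


def namedtuples_equal(a, b):
    """Hypothetical helper: compare two objects, treating redeclared
    namedtuple classes with the same name and fields as the same type."""
    ta, tb = type(a), type(b)
    # Only namedtuple classes carry _fields; otherwise require the same class.
    if not (hasattr(ta, "_fields") and hasattr(tb, "_fields")):
        return ta is tb and a == b
    same_type = (ta.__name__, ta._fields) == (tb.__name__, tb._fields)
    # Tuple equality compares elements only, so it ignores the class.
    return same_type and a == b


X = namedtuple("X", "x")
a = X(1)
X = namedtuple("X", "x")  # redeclared, as in the question
b = X(1)

print(type(a) is type(b))       # False
print(namedtuples_equal(a, b))  # True
```

Note that `a == b` on its own is already true here, because tuple comparison does not look at the class. The explicit name and field check is what stops two unrelated namedtuples with equal values from being reported as identical.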