Defining two classes (a base "ClassA" and a subclass "ClassB") in two separate files gives unexpected results when using Python's isinstance function. The output appears to depend on the name of the module (namespace?) being run as __main__. This behavior appears on both Python 3.8.5 and 3.10.4.

File ClassA.py contains:

class ClassA:
    def __init__(self, id):
        self.id = id
    def __str__(self) -> str:
        class_name = type(self).__name__
        return f"{class_name} WITH id: {self.id}"

def main():
    from ClassB import ClassB
    id = 42
    for i, instance in enumerate([ClassA(id), ClassB(id)]):
        label = f"{type(instance).__name__}:"
        print("#" * 50)
        print(f"{label}   type: {type(instance)}")
        label = " " * len(label)  # Convert label to appropriate number of spaces
        is_a = isinstance(instance, ClassA)
        is_b = isinstance(instance, ClassB)
        print(f"{label} is_a/b: {is_a}/{is_b}")
        print(f"{label}    str: {instance}")

if __name__ == "__main__":
    main()

File ClassB.py contains:

from ClassA import ClassA

class ClassB(ClassA):
    def __init__(self, id):
        super().__init__(id)
        self.id *= -1

File main.py contains:

if __name__ == "__main__":
    from ClassA import main
    main()

The output from running ClassA.py gives:

01: ##################################################
02: ClassA:   type: <class '__main__.ClassA'>
03:         is_a/b: True/False
04:            str: ClassA WITH id: 42
05: ##################################################
06: ClassB:   type: <class 'ClassB.ClassB'>
07:         is_a/b: False/True
08:            str: ClassB WITH id: -42

While the output from running main.py (which calls ClassA.main) gives:

01: ##################################################
02: ClassA:   type: <class 'ClassA.ClassA'>
03:         is_a/b: True/False
04:            str: ClassA WITH id: 42
05: ##################################################
06: ClassB:   type: <class 'ClassB.ClassB'>
07:         is_a/b: True/True
08:            str: ClassB WITH id: -42

Notice how the type of the ClassA instance changes (on line 02) from '__main__.ClassA' (when run from ClassA.py) to 'ClassA.ClassA' (when run from main.py). Similarly, the isinstance checks of the ClassB instance against ClassA and ClassB (on line 07) change from 'False/True' (unexpected) to 'True/True' (desired, expected).

Any comments/suggestions/explanations would be helpful. Thanks.

skuzmo
  • @buran: That's not a good duplicate target; the issue here is the weirdness involved with circular imports where the `__main__` script is part of the cycle. [Using typeguard decorator: @typechecked in Python, whilst evading circular imports?](https://stackoverflow.com/q/74308059/364696) is a kinda-sorta duplicate (in that the problem is similar, and the solution and explanation are the same), but I'm not confident enough of that to invoke my dupehammer. – ShadowRanger Mar 08 '23 at 21:00

1 Answer

The problem you have here is due to ClassA being defined in your main script, and you having circular imports. You actually have three modules involved in your script, not two:

  1. The main script, under the name __main__ (defines __main__.ClassA), which imports...
  2. ClassB (defines ClassB.ClassB) which imports...
  3. ClassA (defines ClassA.ClassA, which is defined identically to __main__.ClassA but is a unique and independent class): a completely separate copy of the main script, imported independently under a different name so that __main__-related behaviors don't trigger

Importantly, ClassB.ClassB is inheriting from ClassA.ClassA, but main is type-checking against __main__.ClassA, a completely unrelated class.
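
To make the "two unrelated classes" point concrete, here is a minimal in-process sketch. It does not use the real import machinery: it assumes a hypothetical helper `load_as` that executes the same source text in two fresh module objects, standing in for the file being loaded once as `__main__` and once as `ClassA`. The `SOURCE` string is a trimmed copy of the class from the question.

```python
import types

# Trimmed copy of the class body from ClassA.py (illustration only).
SOURCE = '''
class ClassA:
    def __init__(self, id):
        self.id = id
'''


def load_as(module_name):
    """Hypothetical helper: run SOURCE in a fresh module object.

    This only mimics what happens when one file is executed under a
    given module name; it does not touch sys.modules.
    """
    module = types.ModuleType(module_name)
    exec(SOURCE, module.__dict__)
    return module


as_main = load_as("__main__")   # stands in for running ClassA.py as a script
as_module = load_as("ClassA")   # stands in for `from ClassA import ClassA`


# ClassB inherits from the *imported* copy, as ClassB.py does.
class ClassB(as_module.ClassA):
    def __init__(self, id):
        super().__init__(id)
        self.id *= -1


b = ClassB(42)
print(as_main.ClassA is as_module.ClassA)  # False: two distinct class objects
print(isinstance(b, as_module.ClassA))     # True
print(isinstance(b, as_main.ClassA))       # False: unrelated to ClassB's base
```

The two `ClassA` objects come from identical source, yet `isinstance` compares class objects, not source text, so the check against the `__main__` copy fails.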

I've gone over why this doesn't work as expected before (and again in another context), so I'll simplify the answer here for your specific case to: Don't involve __main__ in any circular imports. It can import whatever it likes, but it should not have anything else importing from it. In this case, your main.py refactoring is enough to fix the issue (by ensuring there is only one version of class ClassA, ClassA.ClassA). It does have a circular import dependency, which is always a little weird (I'd recommend moving the function main to main.py to avoid that), but since you're deferring one of the imports it's safe enough.
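
As a rough sketch of the layout recommended above (function main moved into main.py, so nothing imports from the script being run), the following writes three small files to a temporary directory and runs main.py in a subprocess. The file contents are trimmed versions of the question's code, and `run_layout` is a hypothetical helper used only for this demonstration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Trimmed versions of the question's files; main() now lives in main.py.
FILES = {
    "ClassA.py": '''\
class ClassA:
    def __init__(self, id):
        self.id = id
''',
    "ClassB.py": '''\
from ClassA import ClassA

class ClassB(ClassA):
    def __init__(self, id):
        super().__init__(id)
        self.id *= -1
''',
    "main.py": '''\
from ClassA import ClassA
from ClassB import ClassB

def main():
    for instance in (ClassA(42), ClassB(42)):
        is_a = isinstance(instance, ClassA)
        is_b = isinstance(instance, ClassB)
        print(f"{type(instance).__name__}: {is_a}/{is_b}")

if __name__ == "__main__":
    main()
''',
}


def run_layout():
    """Hypothetical helper: write FILES to a temp dir and run main.py."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in FILES.items():
            Path(tmp, name).write_text(source)
        result = subprocess.run(
            [sys.executable, str(Path(tmp, "main.py"))],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.splitlines()


lines = run_layout()
print(lines)
```

With this layout ClassA.py is loaded only once, under the name `ClassA`, so the ClassB instance passes both checks, and no import cycle remains.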

ShadowRanger
  • Maybe I wasn't being totally clear in my explanation, but running main.py works **exactly** as I would expect. Additionally, I don't think I have any problems with circular imports; those would raise exceptions causing an exit, correct? I agree with your statements about the (confusing) number of modules, but I am confused about why (when running ClassA.py) the class `__main__.ClassA` is not **equivalent** to `ClassA.ClassA`. If that were the case (for the artificial `__main__` module) then I would be all fat, dumb, and happy. – skuzmo Mar 08 '23 at 21:13
  • Your linked answer went a bit over my head. However, I think the solution you mentioned in your last paragraph for this question is essentially what I implemented in the separate file "main.py", correct? I achieved it by calling ClassA's "main", but I also could have moved/copied ClassA's "main" to "main.py" and added imports for ClassA/ClassB exactly as you were suggesting. I probably should have directly asked: "Why do classes in a module lose capabilities when they are temporarily run as the `__main__` module?" – skuzmo Mar 08 '23 at 21:50
  • @skuzmo: You have what *would* be a circular import, if not for the fact that `__main__.ClassA` and `ClassA.ClassA` are unrelated. `__main__` imports module `ClassB`, which in turn imports module `ClassA`, and it's only because `__main__` is *not* module `ClassA` that no circular import occurs. On rereading, yes, what you did with a minimal separate `main.py` was enough to fix things (because now the class `ClassA` is unique, there aren't two versions). With `main.py` you *do* have a circular import, but you've handled it safely enough so you don't fail because of it. – ShadowRanger Mar 08 '23 at 22:08
  • @skuzmo: On "Why is `__main__.ClassA` not equivalent to `ClassA.ClassA`?": Because `ClassA.py` gets imported and cached *twice*, once under the name `__main__` (so it can do main script behaviors), and once under the name `ClassA` (as an imported module). The import process is assumed to differ between the two, so importing it only once would be wrong (what if certain things are defined differently when it's `__main__` vs. `ClassA`?). It's imported twice because the main script *must* be imported with the weird name so it knows to do scripty things. – ShadowRanger Mar 08 '23 at 22:18
  • Imports are cached, but if you check `sys.modules`, you'll see it actually got cached twice, once under each name, because it was read, parsed, compiled, and run twice. One copy ends up stored as `sys.modules['__main__']`, the other as `sys.modules['ClassA']`, and if you compare the class `ClassA` in each, you'll see it's not the same thing (`sys.modules['__main__'].ClassA is sys.modules['ClassA'].ClassA` will be `False`). – ShadowRanger Mar 08 '23 at 22:19
  • It helps to keep the distinction between *files* and *modules* clear. When you execute `ClassA.py` as a script, the *file* `ClassA.py` is used to create one module (named `__main__`) immediately, and later (via the `ClassB` module) the same file is used to create a distinct module named `ClassA`. One file, used to create two modules. When you execute `main.py`, `ClassA.py` is only used to create one module named `ClassA`. – chepner Mar 08 '23 at 23:14
  • @chepner: Yeah, it's not Java; naming the module and the class the same thing (especially in a complicated circular dependency case like this) is unnecessary and makes things more confusing. – ShadowRanger Mar 08 '23 at 23:59
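
The `sys.modules` check described in the comments above can be tried with a small reproduction. This is a sketch under stated assumptions: it writes stripped-down ClassA.py and ClassB.py files (class bodies reduced to `pass`) into a temporary directory and runs ClassA.py as a script in a subprocess; `run_demo` is a hypothetical helper name.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stripped-down ClassA.py: when run as a script it imports ClassB, which in
# turn imports this same file a second time under the module name "ClassA".
CLASS_A = '''\
import sys

class ClassA:
    pass

if __name__ == "__main__":
    import ClassB
    print(sys.modules["__main__"].ClassA is sys.modules["ClassA"].ClassA)
'''

CLASS_B = '''\
from ClassA import ClassA

class ClassB(ClassA):
    pass
'''


def run_demo():
    """Hypothetical helper: run ClassA.py as a script and capture its output."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "ClassA.py").write_text(CLASS_A)
        Path(tmp, "ClassB.py").write_text(CLASS_B)
        result = subprocess.run(
            [sys.executable, str(Path(tmp, "ClassA.py"))],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.strip()


output = run_demo()
print(output)  # False: the file was loaded twice, yielding two ClassA objects
```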