
I became curious about the implementation of the "in" (__contains__?) operator in Python because of this SO question. I downloaded the source code and tried grepping and browsing to find some base definition of it, but I haven't been successful. Could someone show me a way to find it?

Of course a general approach to finding that kind of thing would be best so anyone like me can learn to fish for next time.

I'm using 2.7, but if the process is totally different for 3.x, it would be nice to have both techniques.

KobeJohn
  • I have PyCharm on another computer that I don't have access to now. I think I could use that to look up definitions until reaching some base reference, but is there some way to do it without a lookup tool like that? – KobeJohn Feb 01 '12 at 02:07

2 Answers


I think the implementation starts in PySequence_Contains in Objects/abstract.c. I found it by looking through the implementation of operator.contains in Modules/operator.c, which wraps all the native operators in Python functions.
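
As a quick sanity check of that wrapper relationship, here is a small sketch (my own illustration, not taken from the CPython source) showing that operator.contains gives the same answer as the "in" operator, with the operands reversed. The variable names are made up for the example.

```python
import operator

haystack = [1, 2, 3]

# operator.contains(a, b) is documented as "b in a" (note the reversed operands).
found_via_wrapper = operator.contains(haystack, 2)
found_via_in = 2 in haystack

missing_via_wrapper = operator.contains(haystack, 5)
missing_via_in = 5 in haystack

# The dunder-named alias is exposed by the same module.
substring_via_alias = operator.__contains__("spam", "pa")
```

Because the wrapper is that thin, finding where operator.contains is defined in C is a reasonable entry point for tracing "in" itself.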

Cristian Ciupitu
millimoose
  • Trying to follow that rabbit hole now. I had found PySequence_Contains before but didn't know the abstract.c and operator.c were the places to find the definition. How would I be able to figure that out myself? Just have to know the organization of the python source code? – KobeJohn Feb 01 '12 at 02:25
  • @yakiimo I found `operator.c` by doing an `ack --type=cc contains`. (Get [`ack`](http://betterthangrep.com/) and stop using `grep` to search source code like right now.) – millimoose Feb 01 '12 at 02:29
  • @yakiimo A hypothetical train of thought to follow would be that `in` is an operator, and all operators have a wrapper in the `operator` module. Look at the docs for that module to find out the name of the function, then find the implementation of that module in the source, and search that file for the name of the wrapper function. – millimoose Feb 01 '12 at 02:34
  • The problem with that train of thought, though, is that it presupposes knowing that "all operators have a wrapper in the operator module", which I didn't know. Or am I missing something? – KobeJohn Feb 01 '12 at 02:39
  • @yakiimo I just assumed that someone about to poke around the Python source code would already have some degree of familiarity with the Python standard library. (And that they'd have the same inclination to use `operator.attrgetter()` with `list.sort()` as I do.) – millimoose Feb 01 '12 at 02:55
  • I'm going to post my sequence as an answer. If you have any comments on how to improve that for anyone else that might find it helpful, please let me know. – KobeJohn Feb 01 '12 at 02:59
  • I often use [cscope](http://cscope.sourceforge.net/) for this kind of thing. It helps a lot to be able to search for a string in a certain context, rather than just any occurrence. On the minus side, cscope only works for C and C++ (and maybe Java). – jjlin Feb 01 '12 at 03:20

Based on Inerdial's push in the right direction, here is how I dug down to it. Any feedback on a better way to go about this would be appreciated.

ack --type=cc __contains__

leads to operator.c

spam2(contains,__contains__,
 "contains(a, b) -- Same as b in a (note reversed operands).")

which registers both "contains" and "__contains__" as the C function "op_contains", which in turn wraps PySequence_Contains (not in this file)

ack --type=cc PySequence_Contains

leads to the definition in abstract.c (just the fallback search below, which is used when the type does not supply its own sq_contains)

result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);

which leads to _PySequence_IterSearch (just the comparison below)

cmp = PyObject_RichCompareBool(obj, item, Py_EQ);

which leads to PyObject_RichCompareBool (not in this file)

ack --type=cc PyObject_RichCompareBool

leads to the definition in object.c, where I finally find the implementation of the comparison. It does an identity check before the actual equality check, which was my original question about the comparison.

/* Quick result when objects are the same.
   Guarantees that identity implies equality. */
if (v == w) {
    if (op == Py_EQ)
        return 1;
    else if (op == Py_NE)
        return 0;
}
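
The effect of that identity shortcut can be observed from Python without reading any C. The following is a rough sketch under my own assumptions: contains_sketch is a hypothetical pure-Python model of the iteration fallback, and OnlyIter is a made-up container that defines __iter__ but no __contains__, so "in" has to fall back to iterating. NaN is a handy probe because it never compares equal to itself.

```python
def contains_sketch(iterable, target):
    """Hypothetical Python model of the fallback: identity first, then ==."""
    for item in iterable:
        if item is target or item == target:
            return True
    return False


class OnlyIter(object):
    """Made-up container with __iter__ but no __contains__."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


nan = float("nan")
box = OnlyIter([nan, 1, 2])

nan_equals_itself = (nan == nan)          # False: NaN never compares equal
nan_in_box = (nan in box)                 # True: same object, identity wins
other_nan_in_box = (float("nan") in box)  # False: different object, == fails
sketch_agrees = contains_sketch(box, nan) == nan_in_box
```

If the membership test relied on equality alone, nan_in_box would be False, so the result is consistent with the v == w check shown above.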
KobeJohn