
I get a segmentation fault when trying to display an object of a class defined in a C extension.

In [1]: import moose
on node 0, numNodes = 1, numCores = 2

In [2]: a = moose.Neutral('a')

In [3]: print a
<moose.Neutral: id=459, dataIndex=0, path=/a[0]>

In [4]: a
Segmentation fault (core dumped)

I implemented the repr and str functions in C, but stepping through with gdb shows that control never reaches them. Displaying an object of another class in the same extension works, and the standard Python 2.7 interpreter also works fine.

print a works in the IPython console.

I checked my refcounts and that does not seem to be problematic (PyObject_New initializes the refcount to 1).

So what is the special thing that IPython is doing when I enter a variable identifier directly in the console that causes the segfault?

It is a pretty large module and a minimal example is nontrivial. The repr function definition is:

PyObject * moose_ObjId_repr(_ObjId * self)
{
    if (!Id::isValid(self->oid_.id)){
        RAISE_INVALID_ID(NULL, "moose_ObjId_repr");
    }
    ostringstream repr;
    repr << "<moose." << Field<string>::get(self->oid_, "className") << ": "
         << "id=" << self->oid_.id.value() << ", "
         << "dataIndex=" << self->oid_.dataIndex << ", "
         << "path=" << self->oid_.path() << ">";
    return PyString_FromString(repr.str().c_str());
} // !  moose_ObjId_repr

The gdb stack trace is below. Unfortunately I cannot use a Python debug build because of a subtle issue in the Python C-API. The segmentation fault appears to occur before control reaches any function in the module (the module is built with debug flags, and if I do print a, a breakpoint on the repr function catches it).

#0  0x000000000050ffab in ?? ()
#1  0x0000000000503cbc in ?? ()
#2  0x00000000004879ba in ?? ()
#3  0x000000000049968d in PyEval_EvalFrameEx ()
#4  0x00000000004a090c in PyEval_EvalCodeEx ()
#5  0x000000000049ab45 in PyEval_EvalFrameEx ()
#6  0x00000000004a090c in PyEval_EvalCodeEx ()
#7  0x000000000049ab45 in PyEval_EvalFrameEx ()
#8  0x00000000004a1c9a in ?? ()
#9  0x00000000004dfe94 in ?? ()
#10 0x0000000000505f96 in PyObject_Call ()
#11 0x00000000004dddad in ?? ()
#12 0x0000000000499be5 in PyEval_EvalFrameEx ()
#13 0x00000000004a090c in PyEval_EvalCodeEx ()
#14 0x0000000000499a52 in PyEval_EvalFrameEx ()
#15 0x00000000004a090c in PyEval_EvalCodeEx ()
#16 0x000000000049ab45 in PyEval_EvalFrameEx ()
#17 0x00000000004a1c9a in ?? ()
#18 0x00000000004dfe94 in ?? ()
#19 0x0000000000505f96 in PyObject_Call ()
#20 0x00000000004dddad in ?? ()
#21 0x00000000004dc9cb in PyEval_CallObjectWithKeywords ()
#22 0x000000000049cf17 in PyEval_EvalFrameEx ()
#23 0x00000000004a090c in PyEval_EvalCodeEx ()
#24 0x0000000000588d42 in PyEval_EvalCode ()
#25 0x000000000049e460 in PyEval_EvalFrameEx ()
#26 0x00000000004a090c in PyEval_EvalCodeEx ()
#27 0x000000000049ab45 in PyEval_EvalFrameEx ()
#28 0x00000000004a090c in PyEval_EvalCodeEx ()
#29 0x0000000000499a52 in PyEval_EvalFrameEx ()
#30 0x00000000004a090c in PyEval_EvalCodeEx ()
#31 0x0000000000499a52 in PyEval_EvalFrameEx ()
#32 0x00000000004a090c in PyEval_EvalCodeEx ()
#33 0x0000000000499a52 in PyEval_EvalFrameEx ()
#34 0x00000000004a090c in PyEval_EvalCodeEx ()
#35 0x0000000000499a52 in PyEval_EvalFrameEx ()
#36 0x00000000004a090c in PyEval_EvalCodeEx ()
#37 0x000000000049ab45 in PyEval_EvalFrameEx ()
#38 0x00000000004a1c9a in ?? ()
#39 0x0000000000505f96 in PyObject_Call ()
#40 0x000000000049b07a in PyEval_EvalFrameEx ()
#41 0x00000000004a090c in PyEval_EvalCodeEx ()
#42 0x0000000000499a52 in PyEval_EvalFrameEx ()
#43 0x00000000004a1634 in ?? ()
#44 0x000000000044e4a5 in PyRun_FileExFlags ()
#45 0x000000000044ec9f in PyRun_SimpleFileExFlags ()
#46 0x000000000044f904 in Py_Main ()
#47 0x00007ffff7818ec5 in __libc_start_main (main=0x44f9c2 <main>, argc=2, argv=0x7fffffffde28, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffde18) at libc-start.c:287
#48 0x0000000000578c4e in _start ()
subhacom
  • Can you post the C and Python code? – Chris Jul 07 '15 at 17:16
  • Please also add stacktrace from `gdb`. – Mikko Ohtamaa Jul 07 '15 at 17:30
  • @Chris this is a large module and I am not sure I can create a minimal example. I am adding the repr function anyways if that is of any use. – subhacom Jul 07 '15 at 18:03
  • @MikkoOhtamaa Added stack trace. – subhacom Jul 07 '15 at 18:13
  • @subhacom: Can you try a debug build so we get the name of the crashing function? Also, Ubuntu for example has `python-dbg`, which contains the debug symbols. – Mikko Ohtamaa Jul 07 '15 at 18:14
  • @subhacom: Also I would try to set breakpoint in the C code itself. – Mikko Ohtamaa Jul 07 '15 at 18:15
  • @MikkoOhtamaa (1) as I mentioned in the edited post "Unfortunately I cannot use Python debug build because of a subtle issue in Python C-API." Relevant discussion here: http://mail.python.org/pipermail/python-dev/2009-July/090921.html. (2) I am setting breakpoint in the C function (in the extension module) and gdb captures control there if I do `print a`. As mentioned, I cannot use Python debug build. – subhacom Jul 07 '15 at 18:22
  • FWIW, `print(a)` prints `str(a)`, while `a` prints `repr(a)`. What happens when you write `repr(a)` in IPython? I have the feeling your `tp_repr` is not filled in, just `tp_str` – codewarrior Jul 07 '15 at 20:38
  • @codewarrior I checked that first thing in the C code :( `In [3]: print(a) In [4]: repr(a) Out[4]: ''` – subhacom Jul 07 '15 at 22:56
  • Here is the special thing IPython does to print the result of an expression: It attempts to call several special instance methods for formatting the result in different forms. `_repr_pretty_`, `_repr_html_`, `_repr_png_`, and so on for a dozen different kinds of output. If the object does not define `_repr_pretty_` when IPython tries to format the object as text, it falls back to `repr()`. Does your class define `tp_getattr`? – codewarrior Jul 08 '15 at 07:20
  • @codewarrior that sounds like a clue. yes my code defines tp_getattro. – subhacom Jul 09 '15 at 03:56
  • @codewarrior Turns out that IPython is trying to access the `__class__` attribute and that causes a segfault - even in Python. Can you post your comment as an answer so I can accept it? I am defining the classes dynamically in C code and am forced to use this: `new_class->tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HEAPTYPE;` It turns out that removing Py_TPFLAGS_HEAPTYPE solves the problem. But – subhacom Jul 09 '15 at 04:20
  • But what? Don't leave me hanging! I need to know why you need HEAPTYPE! – codewarrior Jul 09 '15 at 04:38
  • @codewarrior if we do not set Py_TPFLAGS_HEAPTYPE, GC tries tp_traverse on these classes (even when I unset Py_TPFLAGS_HAVE_GC) and fails the assertion in debug build of Python: `python: Objects/typeobject.c:2683: type_traverse: Assertion `type->tp_flags & Py_TPFLAGS_HEAPTYPE' failed.` I do not think these things are documented and I had to figure them out the hard way stepping through and investigating python source code. – subhacom Jul 09 '15 at 05:03

1 Answer


The special behavior of IPython when printing the result of an expression is that it tries to call several special methods on the object to get its representation in different forms: _repr_pretty_, _repr_html_, _repr_png_ and so on. These are used to display the repr in things like IPython Notebooks in a web browser, or display the output of a matplotlib figure as an image. This logic is contained in IPython.lib.pretty in the RepresentationPrinter class.

In looking for these special methods, it gets the result object's class via __class__ and manually walks its method resolution order rather than calling the special methods normally. It does this to find out if any pretty-printers are registered for any of the base classes of that class. Only after no special methods and no registered pretty-printers are found does it fall back to calling repr().
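The lookup order described above can be approximated in a few lines of Python. This is only a simplified sketch, not IPython's actual code: `REGISTERED_PRINTERS` and `sketch_pretty` are hypothetical names, and the `_repr_pretty_` hook here takes no extra arguments, whereas the real hook receives a printer object and a cycle flag. What it illustrates is that `__class__` and the MRO are touched before `repr()` is ever called.

```python
# Hypothetical registry mapping a class to a printer function (illustration only).
REGISTERED_PRINTERS = {}


def sketch_pretty(obj):
    """Roughly imitate the lookup order described above; not IPython's real code."""
    # Step 1: fetch the class through attribute access, falling back to type().
    cls = getattr(obj, "__class__", None) or type(obj)

    # Step 2: walk the method resolution order by hand.
    for base in getattr(cls, "__mro__", (cls,)):
        # A printer registered for the class or any of its bases wins.
        if base in REGISTERED_PRINTERS:
            return REGISTERED_PRINTERS[base](obj)
        # Otherwise look for a _repr_pretty_ defined directly on this class.
        method = base.__dict__.get("_repr_pretty_")
        if callable(method):
            return method(obj)

    # Step 3: only now fall back to the ordinary repr().
    return repr(obj)


class Plain(object):
    def __repr__(self):
        return "<Plain>"


class Fancy(object):
    # Simplified zero-argument hook; an assumption made for this sketch.
    def _repr_pretty_(self):
        return "<Fancy, pretty>"

    def __repr__(self):
        return "<Fancy>"


print(sketch_pretty(Plain()))
print(sketch_pretty(Fancy()))
```

With an extension type whose `__class__` access is broken, a lookup of this shape would fail at step 1, which is consistent with a breakpoint on the C repr function never being hit.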

The crash may be related to getting attributes from the object (tp_getattro) or attributes from the object's class (I honestly don't know how this is supported for C extension types.)
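One way to narrow down which access is responsible is to run each step separately and announce it before it executes. The sketch below is a hypothetical helper (`probe` is not part of IPython or moose); it cannot prevent a segfault, but the last label printed before the crash points at the offending step. It is demonstrated on a plain `object()`; for the case in the question one would call `probe(a)` instead.

```python
import sys


def probe(obj):
    """Run each access step on its own, printing its label before it executes."""
    steps = [
        ("type(obj)", lambda: type(obj)),
        ("obj.__class__", lambda: obj.__class__),
        ("type(obj).__mro__", lambda: type(obj).__mro__),
        ("getattr(obj, '_repr_pretty_', None)",
         lambda: getattr(obj, "_repr_pretty_", None)),
        ("repr(obj)", lambda: repr(obj)),
    ]
    attempted = []
    for label, step in steps:
        sys.stdout.write("trying %s\n" % label)
        sys.stdout.flush()  # make the label visible before a possible crash
        try:
            step()
        except Exception as exc:
            # Ordinary Python exceptions are reported; a segfault kills the process.
            sys.stdout.write("  raised %s\n" % type(exc).__name__)
        else:
            sys.stdout.write("  ok\n")
        attempted.append(label)
    return attempted


probe(object())
```

Running the steps in the standard interpreter as well as in IPython helps separate a problem in the extension type from anything specific to the IPython console.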

codewarrior
  • FWIW, I found all of this out while I was trying to debug a weird problem with getting tracebacks pointing to IPython code, ultimately resulting from trying to get the representation of a PySide object wrapper whose underlying C++ object was already deleted. – codewarrior Jul 09 '15 at 04:41