
I am having difficulty using lxml and h5py together in the same package. When they have both been imported, even if they are not imported in the same file, calling lxml.etree.parse() always segfaults.

In [1]: from lxml import etree
In [2]: parser = etree.XMLParser(dtd_validation=True, attribute_defaults=True)
In [3]: etree.parse('foo.xml', parser)
Out[3]: <lxml.etree._ElementTree at 0x1bb9638>

versus

In [1]: import h5py
In [2]: from lxml import etree
In [3]: parser = etree.XMLParser(dtd_validation=True, attribute_defaults=True)
In [4]: etree.parse('foo.xml', parser)
Segmentation fault

Switching the order of imports does not seem to matter. Any thoughts on avoiding this while still importing both packages?

Edit: Adding a bit of info that I should have added before. The same thing happens if this is done in a script rather than IPython.
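
For anyone trying to reproduce this from a script, here is a minimal sketch that runs each import order in its own child interpreter, so a crash in the child does not take down the process doing the testing. The helper names `run_isolated` and `build_script` are hypothetical, `foo.xml` is the file from the session above, and `-X faulthandler` is the standard CPython option that prints a Python traceback on a fatal signal. It assumes lxml and h5py are installed; if they are not, the child simply exits with an ImportError.

```python
import subprocess
import sys

# Reproduction template built from the session above; 'foo.xml' must exist
# next to the script for the parse call to be reached.
TEMPLATE = """
{first}
{second}
parser = etree.XMLParser(dtd_validation=True, attribute_defaults=True)
etree.parse('foo.xml', parser)
"""


def build_script(first, second):
    """Return the reproduction source with the two imports in the given order."""
    return TEMPLATE.format(first=first, second=second)


def run_isolated(code):
    """Run `code` in a fresh interpreter and return (exit status, stderr).

    On POSIX a negative status means the child was killed by a signal,
    for example -11 for SIGSEGV.
    """
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    orders = [
        ("import h5py", "from lxml import etree"),
        ("from lxml import etree", "import h5py"),
    ]
    for first, second in orders:
        status, stderr = run_isolated(build_script(first, second))
        print(f"{first!r} then {second!r}: exit status {status}")
        if status != 0:
            # Show whatever the child wrote (traceback or faulthandler dump).
            print(stderr)
```

Comparing the two exit statuses shows whether the import order matters on a given installation.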

Vorticity
  • Does this only happen in IPython? Do you get the same effect from a script? – ebarr Apr 01 '14 at 08:31
  • Same effect in a script. I had the same thought... – Vorticity Apr 01 '14 at 09:04
  • And do you get the same error if you use anything other than `etree.parse`? One possibility is that your h5py library is corrupted and it only segfaults when Python tries to access some specific memory (or that it faults on import, but the fault isn't registered until you try to do something). – ebarr Apr 01 '14 at 09:42
  • It does the same thing with objectify.parse, but I expect that makes use of etree.parse. Are there options in lxml that would avoid use of etree.parse? – Vorticity Apr 01 '14 at 10:00
  • Switch from `lxml` to `xml` or `elementtree` or any of the other XML parsers. In fact, forget parsing the XML at all and just read the file to see if you get the same error. You could also try running the script with `valgrind` or `gdb` to see which library is producing the fault. – ebarr Apr 01 '14 at 10:03
  • I'll give it a shot when I get back to work in the morning. Thanks for the suggestions. – Vorticity Apr 01 '14 at 11:23
  • Well, I feel stupid now because I solved this problem a couple of years back and it wound up being pretty simple. It isn't an ideal solution because it can cause problems for users of the package if they need to import lxml, but switching the order of lxml and h5py so that lxml is imported first fixes the problem. I'm debating whether to just delete this question, answer it, or leave it until I find a better way to fix the problem. – Vorticity Apr 04 '14 at 22:33
  • Please don't delete. This is useful information and it may help others in the future. I would suggest you put a short answer to your own question along the lines of your comment. If you get a better fix down the line then you can always come back and edit it. – ebarr Apr 05 '14 at 02:15
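
The workaround from the comments (import lxml before h5py) depends on the package being the first thing to load either library. As a rough sketch, assuming only that the order matters as reported above, a package could check `sys.modules` at import time and warn when h5py is already loaded but lxml is not. The function names here are hypothetical and not part of either library.

```python
import sys
import warnings


def loaded_out_of_order(first="lxml", second="h5py", modules=None):
    """Return True if `second` is already imported but `first` is not."""
    modules = sys.modules if modules is None else modules
    return second in modules and first not in modules


def warn_if_out_of_order():
    """Warn when h5py was imported before lxml; return True if it warned."""
    if loaded_out_of_order():
        warnings.warn(
            "h5py was imported before lxml; this order was reported to "
            "segfault in lxml.etree.parse()",
            RuntimeWarning,
        )
        return True
    return False


# Intended use at the top of the package's __init__.py (assumes both
# libraries are installed):
#
#     warn_if_out_of_order()
#     from lxml import etree  # imported first, per the workaround above
#     import h5py
```

This does not fix the underlying conflict; it only makes the problematic order visible to users of the package.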
