
I have a package that is giving me trouble (netCDF4). I found out that if I download and build the source code myself, it works fine.

There are two ways to install this package from source. For the examples below, the source code is at ~/netcdf.

  1. Run `pip install ~/netcdf`; pip builds and installs the package, and it runs just fine.

  2. Run `python3 -m build` from the ~/netcdf directory, then install the wheel or tar.gz file from the dist subfolder.

The former works fine, but the latter yields a package that seems to be incomplete.

A good test is running `python3 -c "from netCDF4 import Dataset"`. With the pip-installed source code, that works fine. With my own build, it results in an error:

ImportError: cannot import name 'Dataset' from 'netCDF4' (unknown location)
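
The "(unknown location)" part of that message is worth a closer look. Python prints it when the imported module has no usable `__file__`, which is often the case for a namespace package (a directory without an `__init__.py`). Below is a minimal diagnostic sketch that uses only the standard library. `describe_module` is a hypothetical helper name, not part of netCDF4 or pip.

```python
import importlib.util


def describe_module(name):
    """Report where the import system would load a top-level module from.

    Uses find_spec, so the module itself is not imported.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {
            "found": False,
            "origin": None,
            "search_locations": [],
            "looks_like_namespace": False,
        }
    locations = list(spec.submodule_search_locations or [])
    return {
        "found": True,
        # Path of the file the module is loaded from, if there is one.
        "origin": spec.origin,
        # Directories searched for submodules (non-empty for packages).
        "search_locations": locations,
        # Heuristic: a package with no origin file is likely a namespace package.
        "looks_like_namespace": spec.origin in (None, "namespace") and bool(locations),
    }


if __name__ == "__main__":
    for key, value in describe_module("netCDF4").items():
        print(f"{key}: {value}")
```

Running it with the venv's interpreter, from both inside and outside the source directory, shows whether the import resolves to site-packages or to something else.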

What is the difference between the wheels generated by `python3 -m build` and by `pip install`? Note that the package must be compiled (it is not pure Python).
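
One way to compare the two builds directly, as a rough sketch: a wheel is a zip archive, so the two file lists can be diffed with the standard library. `pip install .` does not leave a wheel behind, so this assumes the pip-built wheel was saved separately with `python3 -m pip wheel . --no-deps -w pip-dist/`. The `pip-dist` directory and the `diff_wheels` helper are illustrative names.

```python
import sys
import zipfile


def wheel_files(path):
    """Return the set of file names stored in a wheel (a zip archive)."""
    with zipfile.ZipFile(path) as wheel:
        return set(wheel.namelist())


def diff_wheels(path_a, path_b):
    """Return (names only in path_a, names only in path_b), each sorted."""
    files_a = wheel_files(path_a)
    files_b = wheel_files(path_b)
    return sorted(files_a - files_b), sorted(files_b - files_a)


if __name__ == "__main__":
    # Usage: python3 diff_wheels.py dist/first.whl pip-dist/second.whl
    if len(sys.argv) == 3:
        only_a, only_b = diff_wheels(sys.argv[1], sys.argv[2])
        print("Only in", sys.argv[1])
        for name in only_a:
            print("  ", name)
        print("Only in", sys.argv[2])
        for name in only_b:
            print("  ", name)
```

A missing compiled extension (a `.so` file) or a missing `__init__.py` in one of the two lists would point at what the incomplete build left out.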

NOTE: I don't need help with netCDF4. I want to understand why both builds yield different results.

To reproduce:

wget https://github.com/Unidata/netcdf4-python/archive/refs/tags/v1.6.2.tar.gz
tar -xvzf v1.6.2.tar.gz
cd netcdf4-python-1.6.2
python3 -m build  # this creates its own, clean virtual env for building (depends on package **build**)
python3 -m venv ./env && source env/bin/activate
pip3 install dist/netCDF4-1.6.2-cp310-cp310-linux_x86_64.whl
python3 -c "from netCDF4 import Dataset"  # I get an error

deactivate && rm -rf ./env
python3 -m venv ./env && source env/bin/activate
pip install .
python3 -c "from netCDF4 import Dataset"

During the `python3 -m build` run, I also get some compiler warnings that I don't see with the pip installation. I assume pip simply suppresses them, as I doubt it magically corrects these C impurities.

bluppfisk
  • It should lead to the same result in principle. If not, then it might be because there is something wrong in *netCDF4*. -- Are you sure that *netCDF4* is installed? `python3 -m pip show netCDF4`? – sinoroc Mar 29 '23 at 08:35
  • Yes, it is installed and shows up in `pip show`. I can also `import netCDF4`, but `Dataset` is not part of it. I don't have this problem anymore with the newer version of netCDF4 (1.6.3), but I mostly want to understand why there's a difference between the two wheels. – bluppfisk Mar 29 '23 at 08:44
  • Why not write this clearly in the question straight from the start, then? I cannot help if not all relevant details are laid out in the question. Why did I have to do research to figure out which project you are talking about? Why didn't you provide a link to the library's GitHub or PyPI page? -- You are not likely to get help if you do not make it easy for others to help you. -- I recommend you rewrite the whole question top to bottom, being extra clear and hyper-focused on the actual issue, and provide links and references where necessary. – sinoroc Mar 29 '23 at 08:47
  • Maybe the difference is that the source you downloaded is different from the version that *pip* found on _PyPI_. The source code might have a lot of changes that have not been released on _PyPI_ yet. I do not know. Maybe ask the maintainers of the library directly rather than here on StackOverflow. – sinoroc Mar 29 '23 at 08:51
  • I appreciate your help; however, I don't need my problem with netcdf4 solved. I came to understand the difference between the two build options, because the very same source code downloaded from their official release yields different results if built with `pip install source-code/` than if built with `python3 -m build && pip install dist/`. – bluppfisk Mar 29 '23 at 09:00
  • OK, so *netCDF4* might be completely irrelevant here? Can you reproduce this behavior with another Python project or only with *netCDF4*? – sinoroc Mar 29 '23 at 09:04
  • I've never encountered anything of the like, but since the resulting wheels are different, there must be something that `pip install` does and `python3 -m build` doesn't (and I don't know what). PS: I updated my question to clarify that netcdf4 is not the problem. – bluppfisk Mar 29 '23 at 09:06
  • In principle it should be the same. Typically *pip* always installs from a *wheel*. So if you instruct *pip* to install from a source code directory, *pip* always starts by building a *wheel* of it first, the same way that `python -m build` does. There really should not be any difference. -- If you edit your question to place a link to the source code of the library, we might be able to spot an issue in their code (but unlikely). -- What might happen is that you are not actually testing what you think you are testing. Make sure to use clean virtual environments, and always use `python -m ...`. – sinoroc Mar 29 '23 at 09:10
  • I am worried by your inconsistent use of commands, I recommend you read this: https://snarky.ca/why-you-should-use-python-m-pip/ -- So make sure to create fresh clean virtual environments, and also make sure to use `path/to/venv/bin/python -m pip install ...` and then to test `path/to/venv/bin/python -m pip show netCDF4` and `path/to/venv/bin/python -c 'from netCDF4 import Dataset'`. – sinoroc Mar 29 '23 at 09:15
  • The environments are clean. The source code is downloaded from: https://github.com/Unidata/netcdf4-python/archive/refs/tags/v1.6.2.tar.gz. I added some steps to reproduce. Glad to hear that it should in principle be the same; that helps my understanding. I am still somewhat confused that the results are different, though. – bluppfisk Mar 29 '23 at 09:32
  • I tried to reproduce the issue, but `python -m build` fails (something like `ValueError: invalid literal for int() with base 10: '/*!<'`). Maybe if it is possible for you to create a Dockerfile for the failing case, I could look at it. If not, try to post the full console output. -- I saw in the ["Developer install" doc](https://unidata.github.io/netcdf4-python/#developer-install) that they recommend the `python setup.py build` and `python setup.py install` way of installing. The `setup.py` seems pretty complex; maybe that is why it does not work with the standard way (`python -m build`). – sinoroc Mar 29 '23 at 10:37
  • Even their CI/CD seems to use `python setup.py install` instead of the standard methods. -- Their packaging setup is pretty complex (and does not follow modern practices). -- So to answer the question: in principle both should deliver the same result, but they don't. I cannot reproduce. If I had the full trace output, maybe I could spot something. But it is probably best to ask them directly. – sinoroc Mar 29 '23 at 10:43
  • Your time and effort are appreciated. I am daft - their 1.6.2 contained a bug that wouldn't allow building on some (Debian) systems because of that error you have above. I patched that (https://github.com/Unidata/netcdf4-python/pull/1219/files) but that's not available until 1.6.3. Sorry. I'll go ask them directly what's up. – bluppfisk Mar 29 '23 at 12:11

0 Answers