3

What is the reason packages are distributed separately?

  • Why do we have separate 'add-on' packages like pandas and numpy?
  • Since these modules seem so important, why are these not part of Python itself?

Are there "single distributions" of Python that come with these packages pre-loaded?

  • Even if it is part of the design to keep the 'core' separate from additional functionality, these packages should at least come 'pre-imported' as soon as you start Python.

  • Where can I find such distributions if they exist?

HoldOffHunger
Dhiraj
  • This will probably attract some opinion-based answers from people who agree/disagree that the libraries are "required". Interesting question nonetheless. – byxor Mar 26 '17 at 21:29
  • 7
    There are too many "important" libraries that should be prebuilt in that case. Beyond numerical computation etc... What you need is a special distribution of python, *e.g.* anaconda. – keepAlive Mar 26 '17 at 21:30
  • 3
    Python has been around far longer than pandas and numpy. Whether to include them is simply a matter of judgement. However, if you'd rather deal with a more streamlined install, look into anaconda. Anaconda installs python along with many other non-standard libraries. It's intended to be a one-stop shop for much of the scientific computing community. – piRSquared Mar 26 '17 at 21:31
  • 1
    You don't want to blow things up or bloat the language too much by including a gazillion Python packages in the core language. Btw, not all these packages are overseen by Guido, the BDFL. So, it's up to him & the Python core committee to decide. On a side note, as an engineer it's up to him/her to know how to patch things up. – kmario23 Mar 26 '17 at 22:04
  • 2
    They're great libraries but they're also 800-pound gorillas. You can already achieve a lot with Python and its standard libraries. Keep in mind that Python is not only used on desktops but also on embedded, Raspberry Pi, TV receivers... Those small appliances don't need pandas or numpy. – Eric Duminil Mar 26 '17 at 22:05
  • 2
    Python is used for a lot of things besides numerical work and data series. Full stack web servers use packages like `Django` or `flask`. Most of the Python utilities in Linux distributions don't need these addons. Look at the Python tag, and compare its statistics with the related ones. – hpaulj Mar 26 '17 at 22:39
  • You want to _automatically_ import Numpy and Pandas? Seriously? Plenty of people use Python without _ever_ using Numpy, let alone Pandas. Numpy is great when you need it, but it's a rather large imposition if you aren't going to use it. Numpy takes ages to load the first time you import it (although it's reasonably fast on subsequent runs due to file caching by the OS). BTW, if you want to automatically import modules in an interactive session, you can put them into your [PYTHONSTARTUP](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONSTARTUP) script. – PM 2Ring Mar 28 '17 at 10:58
  • I recommend that this be re-opened. It is not primarily opinion-based. One specific question it asks is why Python packages are not pre-bundled. This has a specific, non-opinion-based answer: the packages are separately developed and tested. The other specific question is whether pre-bundled distributions exist and where to find them. This has a specific, non-opinion-based answer: yes, they exist, and there are several well-known sources. I believe this question and the answer have value to other SO users and should be re-opened. – Raymond Hettinger Mar 28 '17 at 22:36

4 Answers

9

Many of these tools, including core Python, are separately developed and distributed by different teams, so it is up to aggregators to curate them and put them into a single distribution. Here are some notable examples:

Raymond Hettinger
4

This is a bit like asking "Why doesn't every motor come with a car around it?"

While a car without a motor is pretty useless, the inverse doesn't hold: Most motors aren't even used for cars. Of course one could try selling a complete car to people who want to have a generator, but they wouldn't buy it.
Also the people designing cars might not be the best to build a motor and vice versa.

The same holds for Python. Most Python installations are never used with numpy, scipy or pandas. Distributing Python with those packages would create a massive overhead.

However, there is of course strong demand for prebuilt distributions that combine those modules with a matching Python version and make sure everything interacts smoothly. Some examples are Anaconda, Canopy, python(x,y), winpython, etc. So an end user who simply wants a car that runs is best off choosing one of those instead of installing everything from scratch. Other users, who always want the newest version of everything, might choose to put the pieces together themselves.

ImportanceOfBeingErnest
2

You can make the interactive interpreter launch with "pre-imported" modules, as well as with pre-run code, using the interactive startup file (the file named by the PYTHONSTARTUP environment variable).
Alternatively, you can use the customization modules (sitecustomize and usercustomize) to pre-run code on every invocation of python.
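
For illustration, a startup file along these lines might look like the sketch below. The file name `startup.py`, the `preimport` helper, and the choice of aliases are all assumptions made for this example, not anything Python prescribes. The file only runs in interactive sessions, and only if the PYTHONSTARTUP environment variable points at it. Modules that are not installed are skipped rather than raising an error.

```python
# startup.py -- hypothetical file name; point PYTHONSTARTUP at it
import importlib

# alias -> module name; these choices are illustrative, adjust to taste
PREIMPORTS = {"np": "numpy", "pd": "pandas"}


def preimport(modules, namespace):
    """Bind each importable module into `namespace` under its alias.

    Modules that cannot be imported are skipped.
    Returns the list of aliases that were actually bound.
    """
    loaded = []
    for alias, name in modules.items():
        try:
            namespace[alias] = importlib.import_module(name)
        except ImportError:
            # Not installed (or broken): leave the alias undefined
            continue
        loaded.append(alias)
    return loaded


# The startup file is executed in the interactive namespace,
# so writing into globals() makes the aliases available at the prompt.
_loaded = preimport(PREIMPORTS, globals())
if _loaded:
    print("pre-imported:", ", ".join(_loaded))
```

With something like `export PYTHONSTARTUP=~/startup.py` in your shell profile (the path is again just an example), `np` and `pd` would be available at the prompt whenever the packages are installed. Scripts run as `python script.py` are not affected.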

Regarding whether pandas and numpy should be part of the standard library - it's a matter of opinion.

A. Rom
-1

PyPI currently has over 100,000 libraries available. I'm sure someone thinks each of these is important.

Why do you need or want to pre-load libraries, considering how easy a pip install is, especially in a virtual environment?
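
As a rough sketch of how little is involved, the standard-library `venv` module can create the environment and the environment's own interpreter can then run pip. The helper names `create_env` and `pip_install` are made up for this example, and the install step assumes network access to a package index.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    env_dir = Path(env_dir)
    venv.create(env_dir, with_pip=with_pip)
    # Windows puts the interpreter in Scripts\, other platforms in bin/
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install(python, *packages):
    """Install packages into the environment that owns `python`."""
    subprocess.run(
        [str(python), "-m", "pip", "install", *packages],
        check=True,  # raise CalledProcessError if pip fails
    )
```

Usage would be something like `python = create_env(".venv")` followed by `pip_install(python, "numpy", "pandas")`; the directory name `.venv` is only a common convention.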

Chris Johnson