6

I've googled and googled, but have found almost nothing in the way of discussions or best practices in managing larger enterprise codebases in Python. Here, I'm simply soliciting any and all pointers to such information. Here's some background and some of the questions I'm looking to answer.

We're long-time Java developers, who have solved similar problems to those mentioned below largely using well established Java best practices, as well as Maven, Ant and a Sonotype Nexus repo.

I'm talking internal software only here. We're not looking to distribute anything Python-based. We've got multiple development groups using Python, each developing sharable utility code libraries, final web applications and stand-alone tools, all in pure Python. Each group has its own Github source repository.

How do we manage our shareable code, both within a group and across groups? Do we create eggs (or something similar) and distribute and install them into the Python system? If so, would we store them in our Nexus repo like our Java jars, or is there a more Python-specific method if internal package distribution? Or, do we just share raw code, checking out sources from multiple Github repos?

If we share raw code, how do we manage getting the Python searchpath right as we bring together code from multiple repositories?

How do we manage package namespaces when we want our packages to all live in a com.ourcompany base namespace? It seems like python isn't too happy when you bring together source trees with overlapping namespaces.

How do we manage third party package versioning? I've never seen easy_install or pip passed a version number. How do we lock down third party package versions?

Do tools exist to aid in Python code reviews, CI, regression testing, etc.?

We're relative newbies to Python code, so some of these questions may have fairly obvious answers. Still, I find it surprising that I can't find more information on managing larger Python codebases.

What issues will we encounter that I haven't thought to ask about, or don't yet know enough to even know to ask about?

Any valuable pointers will be greatly appreciated.

Steve
  • 2,396
  • 2
  • 15
  • 16
  • Hey Steve, to chip away one of your many solid questions, you could "manage third party package versioning" with pip freeze > requirements.txt -- http://pip.readthedocs.org/en/latest/reference/pip_freeze.html – jtmoulia May 10 '14 at 05:20

1 Answers1

0

Well, I won't even try to answer all those (excellent) questions, but here are a few opinionated pointers which will hopefully help (as someone who works in both worlds, though more Java).

Packaging

If so, would we store them in our Nexus repo like our Java jars, or is there a more Python-specific method if internal package distribution? Or, do we just share raw code, checking out sources from multiple Github repos?

Packaging in Python is historically a bit of a mess IMHO, though it feels like it's improving. Distutils is the major / native tool here - I've not used it much, feels slightly scary in places. In general, also check recommended tools.

Pip has all but won the war of mindshare, especially when installing 3rd party libraries. I've not solved the local library problem myself, (maybe someone else reading has), but if I were, I'd probably opt for Pip with local/network-disk repos e.g. by installing from wheels.

Another option (which can cause all sorts of hassles itself) is to package in your OS's native packager, be it Debian-style apt or by creating RPMs, etc. Of course, Windows not so much.

Versioning etc

How do we manage third party package versioning? I've never seen easy_install or pip passed a version number.

Pip

Pip definitely supports version specifiers. Turns out Easy Install does too. I suppose many people / smaller projects opt for latest-and-greatest, which of course isn't always as "appropriate" in the enterprise...

Virtualenv

No discussion of versioning and Python would miss a Python2/3 reference, but I'm sure you're aware of all this already.

More important then would be to mention virtualenv. It truly frees you from the mess you can get in to testing multiple versions, bearing in mind especially that your (*NIX) operating systems typcially rely heavily on Python themselves. It's a big subject so have a look at the docs.

Developer Tooling

Do tools exist to aid in Python code reviews, CI, regression testing, etc.?

Code Review

Very much so. Most code review tools are multi-language (it's just a formatting issue really), so just pick your favourite enterprise-friendly one, be it Crucible, Github's one (Barkeep?), Gerrit, or whatever.

CI

For CI you have almost as many options again. Running python apps is usually less involved than Java ones, so most CI systems, though often Java-focused, support Python. (FWIW, we use drone.io for Quod Libet). Jenkins should have no problem doing this, and it seems people have done so with TeamCity.

However, the "original" or "most Pythonic" is probably Buildbot, but I've not used it personally. Looks a lot newer than I remember, and it had quite a lot of support in the Python community I think...

Testing

For testing, though not quite as mature as JUnit / TestNG, check out the de-facto / JUnit-like unit testing unittest, but also (nicer?) alternatives like nose.py.

For higher level (BDD) testing, try something like Lettuce - as the name implies heavily inspired by Cucumber, or maybe Behave. I've not tried them, but common opinion is they're less mature than Cucumber / JBehave / Concordion / Rspec etc.

Community
  • 1
  • 1
declension
  • 4,110
  • 22
  • 25