I am looking for a better way to structure my research projects. I have the following setup:

There are projects a, b, c and a library lib. Each project tackles a different research question, and the library carries code that is used across projects, so all projects depend on lib. Things get more complicated because project c also depends on projects a and b. When I work on project c, I often update a, b or lib at the same time. Each project is in a separate git repository.

So far I have dealt with this situation by including the dependencies above via git submodule, with all the source files located in the root dir of each project. The advantage is that I keep track of which version of lib my projects depend on. It also allows one of my projects to depend on an outdated version of lib. I run everything from the root directory without "installing" any of the packages into site-packages or similar. When a path is not set correctly, I override it via sys.path.insert.

However, the following points make me want to change layout:

  • I keep losing track of which version of lib I am editing.
  • I want to make use of automated testing tools (tox, Jenkins, etc.), which seem to be much easier to handle with a standard project setup.
  • sys.path.insert can lead to subtle problems which are hard to debug.
  • I usually want all my projects to work with the tip of lib anyway.

Therefore I am currently rearranging all projects (especially lib) to be in line with the standard Python directory structure (source stored in a subdirectory, root contains a setup.py file) so that I can work in a virtualenv. Then I can list all my dependencies in requirements.txt. First I install lib in develop mode via pip install -e . and then I run pip freeze > requirements.txt, which then contains a line similar to this:

-e git+<path_to_remote>@<sha>#egg=lib

So again I have created a dependency on a specific commit (sha), as with git submodule, which ensures that I can check out an old commit and the project should still run. I can now install everything in a virtualenv and have gotten rid of my path problems. Great.

I face some new trouble though. One problem is how to update the sha in requirements.txt. The easiest (but probably not the most elegant) solution I see is to write a pre-commit hook that updates the sha before committing. Is there a better way?
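To make the pre-commit idea concrete, here is a minimal sketch of what such a hook could call. The function names update_pinned_sha and current_sha are hypothetical, and the sketch assumes the requirement line has exactly the -e git+...@<sha>#egg=<name> shape shown above; other line formats are left untouched.

```python
import re
import subprocess


def current_sha(repo_dir: str) -> str:
    """Return the commit hash currently checked out in repo_dir (needs git on PATH)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def update_pinned_sha(requirements_text: str, egg_name: str, new_sha: str) -> str:
    """Replace the sha pinned for egg_name in an editable git requirement line.

    Only lines of the form "-e git+<url>@<sha>#egg=<egg_name>" are rewritten;
    everything else is returned unchanged.
    """
    pattern = re.compile(
        r"^(-e\s+git\+\S+?@)[0-9a-fA-F]+(#egg=" + re.escape(egg_name) + r")[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + new_sha + m.group(2), requirements_text)
```

A hook script would read requirements.txt, call update_pinned_sha(text, "lib", current_sha(path_to_lib)), and write the result back, where path_to_lib is wherever the lib checkout lives on your machine.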

And more generally - do you see a better solution given my setup?

bjonen

1 Answer

As far as I can see, you have mostly solved your problem and only small bits are left.

1) Don't use hashes to identify versions of your libraries. Even if you don't publish your libraries to the Cheese Shop (PyPI), use normal library versioning (semver) and tag your git repositories accordingly. This way you will have human-readable, manageable versions in the git+https://github.com/... URLs of your dependencies.
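As a rough illustration of point 1, the sketch below builds a requirement line pinned to a tag rather than a sha. The helper name tagged_requirement, the "v" tag prefix, and the example repository URL are all assumptions for illustration; use whatever tag naming your repositories already follow.

```python
import re

# Plain MAJOR.MINOR.PATCH only; pre-release and build suffixes are not handled here.
SEMVER = re.compile(r"\d+\.\d+\.\d+")


def tagged_requirement(repo_url: str, version: str, egg_name: str, tag_prefix: str = "v") -> str:
    """Build a pip VCS requirement that points at a semver tag instead of a commit hash."""
    if not SEMVER.fullmatch(version):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    return f"git+{repo_url}@{tag_prefix}{version}#egg={egg_name}"


# Hypothetical repository URL, for illustration only.
line = tagged_requirement("https://github.com/example/lib.git", "1.2.0", "lib")
print(line)
```

The resulting line can go into requirements.txt in place of the sha-pinned one, and bumping the dependency then means changing a readable version number.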

2) Set up tox so that it tests against both the stable version of your dependencies (the one you tagged last) and the master version taken straight from the latest repo revision.

saaj