tl;dr: How do I use Docker's layer caching without a requirements.txt for Python 3?

Long story:
A well-formed Dockerfile for a Python application should look as follows:

FROM python
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
ENTRYPOINT ["python3", "sample.py"]

This reduces the build times of Docker containers significantly, because the layer that downloads and installs the libraries can be cached by Docker and reused in subsequent builds.

Now I have a rather complex Python application, and my team decided to get rid of the requirements.txt and have a single source of truth for our application: our setup.py file.

Which is great from a usability perspective: pip install . for our average user, pip install .[dev] for our developers. If changes happen, they happen in our setup.py.

However, from a DevOps perspective this is not as nice. Unfortunately, it is not possible to execute setup.py without src/ being present. But if src/ is copied into the image before installing our app, it is not possible to make use of Docker's layer caching: any change to the source invalidates the layer that installs the dependencies.

So it is a trade-off between usability and build speed.

My workaround is to auto-generate a requirements.txt for CI/CD purposes, which feels odd. However, it is still better to auto-generate it than to manage dependencies in two files (requirements.txt and setup.py).

I am trying to reduce the build times of my containers while keeping the requirements as comprehensible as possible for developers and end users.

What did I miss?


Here is some code:

# Directory layout
.
├── Dockerfile
├── gen-requirements.py
├── setup.py
└── src
    └── greatapp
        ├── __init__.py
        └── sample.py

Non-working Dockerfile

# DOES NOT WORK: setup.py cannot be executed before src/ has been copied into the image
FROM python
RUN mkdir -p /home/app/
WORKDIR /home/app/
COPY setup.py setup.py
RUN pip install .[dev]
COPY src/ src/
ENTRYPOINT ["great-app"]

Working Dockerfile

FROM python
RUN mkdir -p /home/app/
WORKDIR /home/app/
COPY ["setup.py", "gen-requirements.py", "./"]
# This layer is cached as long as setup.py and gen-requirements.py do not change.
RUN python3 gen-requirements.py \
    && pip install -r requirements.txt
COPY src/ src/
# Runs on every source change, but the dependencies are already satisfied by the layer above.
RUN pip install .[dev]
ENTRYPOINT ["great-app"]
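
For completeness, I build and run the image like this (the greatapp tag is just an example):

docker build -t greatapp .
docker run --rm greatapp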

gen-requirements.py

#!/usr/bin/env python3
# Adapted from https://stackoverflow.com/questions/24236266/how-to-extract-dependencies-information-from-a-setup-py
# https://stackoverflow.com/users/748858/mgilson
#
# Gen-requirements extracts requirements from the setup.py to generate a requirements.txt.
# This allows for a single source of dependencies (truth) in our setup.py file.
# Only used for CI purposes.

from unittest import mock
import setuptools

with mock.patch.object(setuptools, "setup") as mock_setup:
    import setup  # This is setup.py which calls setuptools.setup

_, kwargs = mock_setup.call_args

install_requires = kwargs.get("install_requires", [])
extras_require = kwargs.get("extras_require", {}).get("dev", [])

with open("requirements.txt", "w") as f:
    f.write(f"# Auto generated by {__file__} from setup.py.\n# Change requirements in setup.py.\n")
    f.write("\n".join(install_requires + extras_require) + "\n")
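
For reference, with the setup.py below this produces a requirements.txt that looks roughly like the following (the first line depends on how the script is invoked, since it uses __file__):

# Auto generated by gen-requirements.py from setup.py.
# Change requirements in setup.py.
tensorflow
tox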

setup.py

#!/usr/bin/env python3
import os
from setuptools import find_packages, setup


install_requirements = [
    "tensorflow",
]

development_requirements = [
    # Tests
    "tox",
]

extras = {"dev": development_requirements}

setup(
    name="greatapp",
    package_dir={"": "src"},
    packages=find_packages(where="src"),
    entry_points="""
        [console_scripts]
        great-app=greatapp.sample:main
    """,
    install_requires=install_requirements,
    extras_require=extras,
)

sample.py

#!/usr/bin/env python3
def main():
    a = 3
    b = 3
    print(a + b)


if __name__ == "__main__":
    main()

  • What are you trying to optimize here, installation time? Or is the difficulty in installing the package when you start the Docker build? If you have frozen the Python packages, then you can copy the installed pip packages along with your image. – vinsent paramanantham Dec 17 '20 at 13:29
  • @vinsentparamanantham I am trying to reduce build time. Without proper Docker layering I have to download tensorflow every time I make a small change. – pag Dec 17 '20 at 13:55
  • I think you can take a Docker image where you have a stable tensorflow installed and build on top of that. Or you can try using a Docker Compose file as well, so Docker Compose can use a stable base image, and on startup you can install all the minimal packages and copy in the scripts in the DevOps pipeline. – vinsent paramanantham Dec 17 '20 at 13:59
  • @vinsentparamanantham This would mean that for every project I am responsible for I would have to create a new base container whenever any of the libraries receives a version bump. I would rather not spend my time doing so. – pag Dec 17 '20 at 14:49
  • Mostly we use some base image with the installed packages, and then push the code to run on it. – vinsent paramanantham Dec 17 '20 at 15:08
