I want to create a docker-compose.yml containing our company's analysis toolchain. For this purpose, I added dask. The docker-compose.yml looks like this:

docker-compose.yml

version: '3'
services:
  jupyter:
    build: docker/jupyter/.
    ports:
      - "8899:8899"
    depends_on: 
      - dask-scheduler
      - dask-worker
    volumes:
      - ./notebooks:/notebooks

  dask-scheduler:
    build:
      docker/dask/.
    hostname: dask-scheduler
    ports:
      - "8786:8786"
      - "8787:8787"
    volumes:
      - ./notebooks:/notebooks
    command: ["dask-scheduler"]

  dask-worker:
    build:
      docker/dask/.
    depends_on:
      - dask-scheduler
    volumes:
      - ./notebooks:/notebooks
    command: ["dask-worker", "tcp://dask-scheduler:8786"]

To build both dask containers, I use this Dockerfile:

docker/dask/Dockerfile

FROM python:3.7
RUN apt-get update -y && apt-get install -y python3-pip libsnappy-dev
RUN pip install numpy
RUN pip install dask
RUN pip install distributed
RUN pip install fsspec
RUN pip install fastavro
RUN pip install python-snappy
RUN pip install dask[bag]
RUN pip install dask[dataframe]
RUN pip install jupyter-server-proxy

# Dashboard
EXPOSE 8787
# Scheduler
EXPOSE 8786

In my notebook, I use the following code snippet to connect to the scheduler:

from dask.distributed import Client
client = Client(address="dask-scheduler:8786")
client.dashboard_link 

=> 'http://dask-scheduler:8787/status'

Using the IP of the container does not work either.
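The link reported by the client uses the Compose service name, which normally resolves only inside the Docker network. As a minimal sketch, assuming port 8787 is published to the host as in the compose file above, the link can be rewritten for a browser on the host. The helper name `host_dashboard_link` is hypothetical and not part of dask:

```python
from urllib.parse import urlsplit, urlunsplit


def host_dashboard_link(link, host="localhost"):
    """Swap the hostname in a dashboard link, keeping scheme, port and path.

    Hypothetical helper: it only rewrites the URL string and assumes the
    dashboard port is published to the host under the same port number.
    """
    parts = urlsplit(link)
    # Keep the original port if the link carries one.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(host_dashboard_link("http://dask-scheduler:8787/status"))
# prints http://localhost:8787/status
```

This only changes which address the browser uses; it does not affect what the dashboard server returns.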

The connection lets me run my computations and works fine. What is not working is the dashboard, which should be available at http://localhost:8787/status. It just returns

404: Not Found

My first question is: what am I doing wrong? I found the --dashboard-address argument in the docs and tried various combinations, but none of them changed the output of the dashboard. This leads to my second question: why is the argument available on both the scheduler and the worker?

And finally, what changes do I need to make to get it working? I am using Docker Desktop Community 2.3.0.3 on macOS with Engine 19.03.8.

Thanks for any hints.

user2757652
  • Check https logs - your request reaches the server technically speaking but it seems that resource you're trying to access is not there – Marcin Orlowski Jun 12 '20 at 13:39
  • Can you give me a hint where I can find them? As far as I can see, nothing is printed to stdout. – user2757652 Jun 12 '20 at 14:24
  • httpd is part of your container so its logs usually are there too. – Marcin Orlowski Jun 12 '20 at 16:29
  • I have added `RUN pip install dask[diagnostics]` to the Dockerfile, but nothing changed. I tried adding some bokeh env variables for more logging, but that did not help either. I have also not found the server's log files so far. I logged into the container and tried to curl it but got the same error: `root@dask-scheduler:/# curl -I -L localhost:8787` returns `HTTP/1.1 404 Not Found`, `Server: TornadoServer/6.0.4`, `Content-Type: text/html; charset=UTF-8`, `Date: Mon, 15 Jun 2020 06:31:49 GMT`, `Content-Length: 69` – user2757652 Jun 15 '20 at 06:33

1 Answer

After a long debugging session, I could finally narrow it down by comparing against a previously working environment. With bokeh = "==2.0.2" the dashboard shows up as expected, but with the latest version, bokeh = "==2.1.0", in my Pipfile I got the error described above. It may also be a combination of different versions of various packages.

In case somebody else finds this: pin your bokeh version to 2.0.2 to get the dashboard back. Using the latest releases with no pinned versions at all will break it. So it is not related to docker or docker-compose.


EDIT: It is now fixed in dask release 2.19.0, so updating your dask dependencies should work as well.
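The version findings above can be sketched as a small check. This is only an illustration under assumptions: the function names are hypothetical, and treating every bokeh release below 2.1.0 as working is an extrapolation from the one version (2.0.2) that was actually tested here:

```python
def parse_version(text):
    """Turn '2.19.0' into (2, 19, 0); parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def dashboard_expected_to_work(dask_version, bokeh_version):
    """Rough guess based on this answer, not an official compatibility table."""
    # dask 2.19.0 is the release reported above to contain the fix.
    if parse_version(dask_version) >= (2, 19, 0):
        return True
    # Assumption: with older dask, bokeh below 2.1.0 behaves like the tested 2.0.2.
    return parse_version(bokeh_version) < (2, 1, 0)


print(dashboard_expected_to_work("2.18.1", "2.1.0"))  # prints False
print(dashboard_expected_to_work("2.18.1", "2.0.2"))  # prints True
print(dashboard_expected_to_work("2.19.0", "2.1.0"))  # prints True
```

Inside the scheduler container, the installed versions can be passed in as `dask.__version__` and `bokeh.__version__`.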

Kermit
user2757652