I am trying to run Qiskit with Dask. When I submit a function that executes a quantum circuit simulation using Qiskit, it does not work properly, and the workers also start to produce errors with callbacks. So I decided to preload Qiskit on the workers, but that does not work either: the workers stop gracefully without any error. However, when I preload another Python package such as numpy, they work as expected. Any idea why the workers cannot load Qiskit?

This is a trace of the problem:

[user@c6601 ~]$ conda --version
conda 4.7.12
[user@c6601 ~]$ conda activate qiskit
(qiskit) [user@c6601 ~]$ 
(qiskit) [user@c6601 ~]$ python --version
Python 3.7.7
(qiskit) [user@c6601 ~]$ python -c "import qiskit; print(qiskit.__qiskit_version__)"
{'qiskit-terra': '0.14.1', 'qiskit-aer': '0.5.1', 'qiskit-ignis': '0.3.0', 'qiskit-ibmq-provider': '0.7.1', 'qiskit-aqua': None, 'qiskit': '0.19.2'}
(qiskit) [user@c6601 ~]$ dask-worker --version
dask-worker, version 2.17.0
(qiskit) [user@c6601 ~]$ dask-scheduler --scheduler-file /tmp/sched.json&
[1] 16228
(qiskit) [user@c6601 ~]$ distributed.scheduler - INFO - -----------------------------------------------
distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO -   Scheduler at:    tcp://10.120.66.1:8786
distributed.scheduler - INFO -   dashboard at:                     :8787
(qiskit) [user@c6601 ~]$ dask-worker --scheduler-file /tmp/sched.json --preload "import qiskit;print(qiskit.__qiskit_version__)"
distributed.utils - INFO - Reload module tmp2c9jac8m from .py file
{'qiskit-terra': '0.14.1', 'qiskit-aer': '0.5.1', 'qiskit-ignis': '0.3.0', 'qiskit-ibmq-provider': '0.7.1', 'qiskit-aqua': '0.7.1', 'qiskit': '0.19.2'}
{'qiskit-terra': '0.14.1', 'qiskit-aer': '0.5.1', 'qiskit-ignis': '0.3.0', 'qiskit-ibmq-provider': '0.7.1', 'qiskit-aqua': '0.7.1', 'qiskit': '0.19.2'}
distributed.preloading - INFO - Import preload module: /scratch/4070613/tmp2c9jac8m.py
distributed.dask_worker - INFO - End worker

But numpy loads without problems.

(qiskit) [user@c6601 ~]$ dask-worker --scheduler-file /tmp/sched.json --preload "import numpy;print(numpy.__version__)"
distributed.utils - INFO - Reload module tmpm2y2lp42 from .py file
1.18.1
1.18.1
distributed.preloading - INFO - Import preload module: /scratch/4070613/tmpm2y2lp42.py
distributed.nanny - INFO -         Start Nanny at: 'tcp://10.120.66.1:46577'
distributed.utils - INFO - Reload module tmpzhdz9u4h from .py file
1.18.1
1.18.1
distributed.preloading - INFO - Import preload module: /scratch/4070613/tmpzhdz9u4h.py
distributed.worker - INFO -       Start worker at:    tcp://10.120.66.1:34459
distributed.worker - INFO -          Listening to:    tcp://10.120.66.1:34459
distributed.worker - INFO -          dashboard at:          10.120.66.1:45970
distributed.worker - INFO - Waiting to connect to:     tcp://10.120.66.1:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -               Threads:                          4
distributed.worker - INFO -                Memory:                   21.47 GB
distributed.worker - INFO -       Local Directory: /mnt/netapp2/Home_FT2/home/cesga/user/dask-worker-space/worker-dleqkfmk
distributed.worker - INFO - -------------------------------------------------
distributed.scheduler - INFO - Register worker <Worker 'tcp://10.120.66.1:34459', name: tcp://10.120.66.1:34459, memory: 0, processing: 0>
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.120.66.1:34459
distributed.core - INFO - Starting established connection
distributed.worker - INFO -         Registered to:     tcp://10.120.66.1:8786
distributed.worker - INFO - -------------------------------------------------
distributed.core - INFO - Starting established connection
  • Job submission is an async process. If I had to guess, this, or perhaps transpilation, which spawn threads and processes respectively, is likely the issue here. – Paul Nation Jun 03 '20 at 07:28
  • Thanks Paul. But I am doing nothing with Qiskit. I am only importing the library. – Andres Gomez Jun 13 '20 at 18:43

1 Answer

Dask does many things that some libraries aren't designed for. Common causes of failures like this include the following:

  1. Running functions from many different threads. Some C/C++ libraries have global state that, if mismanaged, will terminate programs without warning.
  2. Serialization. Dask sometimes needs to move Python objects around. Many libraries don't know how to turn their objects into bytes and back, and so fail. These usually err more loudly though.
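
Both causes can be probed outside Dask before involving a worker. The following is a minimal sketch using only the standard library, not Dask's own machinery: the helper names (`run_isolated`, `threaded_import_snippet`, `check_pickle`) are hypothetical, and plain `pickle` is only a stand-in, since Dask's serialization can handle more than plain `pickle` does. A failure here is a hint, not proof.

```python
import pickle
import subprocess
import sys
import threading


def run_isolated(code, timeout=120):
    """Run a Python snippet in a fresh interpreter; return (exit code, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # On POSIX a negative return code means the child was killed by a signal
    # (for example -11 for a segmentation fault on Linux).
    return proc.returncode, proc.stderr.strip()


def threaded_import_snippet(module_name, n_threads=4):
    """Build a snippet that imports a module from several threads at once."""
    return (
        "import importlib\n"
        "from concurrent.futures import ThreadPoolExecutor\n"
        f"with ThreadPoolExecutor({n_threads}) as pool:\n"
        f"    list(pool.map(importlib.import_module, [{module_name!r}] * {n_threads}))\n"
    )


def check_pickle(obj):
    """Return (True, '') if obj survives a pickle round trip, else (False, error)."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception as exc:  # pickling errors come in several exception types
        return False, repr(exc)
    return True, ""


if __name__ == "__main__":
    # "qiskit" is the module from the question; any importable name works.
    code, err = run_isolated(threaded_import_snippet("qiskit"))
    print("exit code:", code)
    if err:
        print(err)
    # A lock is a typical object that plain pickle refuses to serialize.
    print("lock picklable:", check_pickle(threading.Lock())[0])
```

A nonzero or negative exit code with little or no output on stderr points toward the first cause; a clear exception from `check_pickle` on the objects a task sends or returns points toward the second.
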
MRocklin