Dask.distributed is a lightweight library for distributed computing in Python. It extends both the concurrent.futures and dask APIs to moderate-sized clusters.
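As a rough illustration of the concurrent.futures interface mentioned in the tag description, here is a minimal sketch using only the standard library's ThreadPoolExecutor. dask.distributed's Client offers similarly named submit and map methods that return futures, but this sketch does not use Dask itself. The names square and run_squares and the worker count are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Toy task standing in for real work (hypothetical example function)."""
    return x * x


def run_squares(values, max_workers=4):
    """Submit one task per value and collect results in submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() returns a Future immediately; result() blocks until done.
        futures = [executor.submit(square, v) for v in values]
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(run_squares([1, 2, 3]))
```

The submit-then-result pattern is the part that carries over: work is scheduled without blocking, and the caller blocks only when it asks for a result.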
Questions tagged [dask-distributed]
1090 questions
0
votes
2 answers
nested dask.compute not blocking
dask.compute(...) is expected to be a blocking call. However, when I have nested dask.compute calls and the inner one does I/O (like dask.dataframe.read_parquet), the inner dask.compute does not block. Here's a pseudocode example:
import dask,…

user1527390
0
votes
1 answer
Using Dask compute causes execution to hang
This is a follow-up question to a potential answer to one of my previous questions on using Dask compute to access one element in a large array.
Why does using Dask compute cause the execution to hang below?
Here's the working code…

sudouser2010
0
votes
1 answer
Node process with dedicated memory in Python
I'm developing a web application with Apache and Django where users interact with a data model (a C++ implementation wrapped in Python).
To avoid loading and saving data in a file or database after each user operation, I prefer to keep data…

user3790252
-1
votes
2 answers
Dask map_partitions meta when using lambda function to add column
I am using Dask to apply a function myfunc that adds two new columns new_col_1 and new_col_2 to my Dask dataframe data. This function uses two columns a1 and a2 for computing the new columns.
ddata[['new_col_1', 'new_col_2']] =…

S_S
-1
votes
2 answers
Creating different types of workers that are accessed using a single client
EDIT:
My question was horrifically put, so I deleted it and rephrased it entirely here.
I'll give a tl;dr:
I'm trying to assign each computation to a designated worker that fits the computation type.
In long:
I'm trying to run a simulation, so I represent…

Ben Hatzofe
-1
votes
3 answers
relation between regular Dask and dask.distributed
I don't understand the relation between regular Dask and dask.distributed.
With dask.distributed, e.g. using the Futures interface, I have to explicitly create a client, which is backed by a local or remote cluster, and then submit to it using…

A. Donda
-1
votes
1 answer
Python Dask Apply Function and Store Result in Same Column
Hello, I am a bit new to Dask and I am trying to do the following.
I have a CSV file that I am reading, and everything works fine:
import pandas
import os
import json
import math
import numpy as np
import dask
from dask.distributed import…

Soumil Nitin Shah
-1
votes
1 answer
which one to use for model tuning: dask-kubernetes versus dask-yarn
I am a newbie with Dask and am considering using it to parallelize ML model tuning.
Should I try dask-yarn or dask-kubernetes for this requirement?
Any general ideas on where to use which of these will also be helpful for broader…

Himanshu Gautam
-1
votes
1 answer
Dask in the Python REPL - is it possible to set a progress bar?
I am using Dask in the Python REPL. Is it possible to set a progress bar?

power
-2
votes
1 answer
How to resolve Kernel Error or Memory Error?
I have an array of strings whose length is 50000. I am trying to create a similarity matrix of dimension 50000 * 50000. To build it, I tried forming the list of tuples using the following code:
terms = [element for element in…

Vas
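For a sense of scale in the last question, a back-of-the-envelope sketch of the memory a dense similarity matrix would need. It assumes a square matrix with one row and column per string and 8-byte floating-point entries; the excerpt states neither the dtype nor the storage layout, so both are assumptions, and the function names are hypothetical.

```python
def dense_matrix_bytes(n_rows, n_cols, bytes_per_value=8):
    """Bytes needed to hold a dense n_rows x n_cols matrix.

    bytes_per_value=8 assumes 64-bit floats (an assumption, not from the question).
    """
    return n_rows * n_cols * bytes_per_value


def to_gib(num_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    total = dense_matrix_bytes(50_000, 50_000)
    print(f"{total} bytes, about {to_gib(total):.1f} GiB")
```

Under these assumptions the matrix alone is on the order of tens of gigabytes, before counting any intermediate list of tuples, which is one plausible reason a single machine would run out of memory.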