Questions tagged [dask-distributed]

Dask.distributed is a lightweight library for distributed computing in Python. It extends both the concurrent.futures and dask APIs to moderate-sized clusters.

1090 questions
0
votes
2 answers

nested dask.compute not blocking

dask.compute(...) is expected to be a blocking call. However, when I have a nested dask.compute and the inner one does I/O (like dask.dataframe.read_parquet), the inner dask.compute is not blocking. Here's a pseudocode example: import dask,…
user1527390
  • 123
  • 1
  • 3
  • 7
0
votes
1 answer

Using Dask compute causes execution to hang

This is a follow-up question to a potential answer to one of my previous questions on using Dask compute to access one element in a large array. Why does using Dask compute cause the execution to hang below? Here's the working code…
sudouser2010
  • 171
  • 1
  • 6
0
votes
1 answer

Node process with dedicated memory in Python

I'm developing a web application with Apache and Django where users interact with a data model (a C++ implementation wrapped in Python). To avoid loading and saving data in a file or database after each user operation, I prefer to keep data…
-1
votes
2 answers

Dask map_partitions meta when using lambda function to add column

I am using Dask to apply a function myfunc that adds two new columns new_col_1 and new_col_2 to my Dask dataframe data. This function uses two columns a1 and a2 for computing the new columns. ddata[['new_col_1', 'new_col_2']] =…
S_S
  • 1,276
  • 4
  • 24
  • 47
-1
votes
2 answers

Creating different types of workers that are accessed using a single client

EDIT: My question was horrifically put, so I deleted it and rephrased it entirely here. I'll give a tl;dr: I'm trying to assign each computation to a designated worker that fits the computation type. In long: I'm trying to run a simulation, so I represent…
-1
votes
3 answers

relation between regular Dask and dask.distributed

I don't understand the relation between regular Dask and dask.distributed. With dask.distributed, e.g. using the Futures interface, I have to explicitly create a client, which is backed by a local or remote cluster, and then submit to it using…
A. Donda
  • 8,381
  • 2
  • 20
  • 49
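
A minimal sketch of the Futures pattern this question refers to, using only the standard library's concurrent.futures, which the tag description says dask.distributed extends. The function name `square` and the worker count are illustrative assumptions, and no Dask cluster is involved. In dask.distributed, `Client.submit` plays the corresponding role, although details differ (for example, what `map` returns).

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Toy task standing in for real work (hypothetical example function)."""
    return x * x


# An executor is created explicitly, much like a dask.distributed Client.
with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() returns a Future immediately; result() blocks until it is done.
    future = executor.submit(square, 7)
    result = future.result()

    # map() applies the function to each input and yields results in order.
    mapped = list(executor.map(square, range(4)))

print(result, mapped)
```

The point of the sketch is the explicit executor object: work is submitted to it and results are collected from futures, rather than being triggered by a global compute call.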
-1
votes
1 answer

Python Dask Apply Function and Store Result in Same Column

Hello, I am a bit new to Dask and I am trying to do the following: I have a CSV file, and reading the file works fine. import pandas import os import json import math import numpy as np import dask from dask.distributed import…
Soumil Nitin Shah
  • 634
  • 2
  • 7
  • 18
-1
votes
1 answer

which one to use for model tuning: dask-kubernetes versus dask-yarn

I am a newbie to Dask and am considering using it to parallelize ML model tuning. Should I try dask-yarn or dask-kubernetes for this requirement? Any general ideas on where to use which of these will also be helpful for broader…
-1
votes
1 answer

Dask in the Python REPL - is it possible to set a progress bar?

I am using Dask in the Python REPL. Is it possible to set a progress bar?
power
  • 1,680
  • 3
  • 18
  • 30
-2
votes
1 answer

How to resolve Kernel Error or Memory Error?

I had an array of strings whose length is 50000. I am trying to create a similarity matrix of dimension 50000 * 500000. In order to make it, I tried forming the list of tuples using the following code: terms = [element for element in…
Vas
  • 918
  • 1
  • 6
  • 19
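
A rough back-of-the-envelope sketch of why a matrix of this size can trigger a memory error. It assumes a dense matrix of 8-byte floats, which the excerpt does not state, and the helper `dense_matrix_bytes` is hypothetical. Both the dimensions as written (50000 * 500000) and a square 50000 * 50000 case are computed, since the excerpt does not make clear which is intended.

```python
def dense_matrix_bytes(rows, cols, bytes_per_entry=8):
    """Bytes needed for a dense rows x cols matrix (assumes 8-byte floats)."""
    return rows * cols * bytes_per_entry


GIB = 1024 ** 3  # bytes per gibibyte

# Dimensions exactly as written in the question.
as_written = dense_matrix_bytes(50_000, 500_000)

# Square case, in case one row and one column per string was meant.
square_case = dense_matrix_bytes(50_000, 50_000)

print(f"as written:  {as_written / GIB:.1f} GiB")
print(f"square case: {square_case / GIB:.1f} GiB")
```

Either figure exceeds the memory of a typical workstation before any Python object overhead is counted, and a list of tuples would cost considerably more per entry than a packed array.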