Questions tagged [dask-delayed]

Dask.Delayed refers to the Python interface that consists of the delayed function, which wraps a function or object to create Delayed proxies. Use this tag for questions related to this interface.
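
For example, a minimal sketch of how delayed builds a lazy task graph (function and variable names are illustrative):

    import dask

    @dask.delayed
    def add(x, y):
        # nothing executes here; calling add() only records a task in the graph
        return x + y

    total = add(add(1, 2), 3)   # a Delayed proxy, not a number
    print(total.compute())      # runs the graph and prints 6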

290 questions
0
votes
0 answers

Persisting a dask delayed in memory without starting the computation yet

I have multiple computation trees in my python toolkit, but not all are required for the current analysis: a1 = build_a1().persist() a2 = build_a2(a1).persist() a3 = build_a3(a2) b1 = build_b1().persist() b2 = build_b2(b1).persist() b3 =…
epizut
  • 3
  • 3
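
As background for the question above: on a distributed Client, .persist() submits the work to the cluster immediately (asynchronously) rather than leaving it unstarted. A minimal sketch, with build_a1 as a hypothetical stand-in for the snippet's builders:

    import dask
    from dask.distributed import Client

    client = Client(processes=False)   # small local cluster for illustration

    @dask.delayed
    def build_a1():
        return sum(range(1_000_000))

    a1 = build_a1().persist()   # submitted to the cluster right away, non-blocking
    print(a1.compute())         # blocks and fetches the (possibly finished) result
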
0
votes
1 answer

How to connect to an Oracle database and export the data to CSV format using dask?

How can I connect to an Oracle database using dask, fetch the data from it, and create a CSV file from the fetched data?
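
A minimal sketch of one way to do this, assuming a SQLAlchemy-style Oracle connection string; the URI, table, and column names below are placeholders:

    import dask.dataframe as dd

    # placeholder URI; requires an Oracle driver (e.g. python-oracledb) plus SQLAlchemy
    uri = "oracle+oracledb://user:password@dbhost:1521/?service_name=ORCL"

    df = dd.read_sql_table("my_table", uri, index_col="id", npartitions=8)
    df.to_csv("my_table-*.csv", index=False)   # writes one CSV file per partition
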
0
votes
1 answer

How can I use dask_ml preprocessing in a dask distributed cluster

How can I do dask_ml preprocessing in a dask distributed cluster? My dataset is about 200GB, and every time I categorize the dataset in preparation for OneHotEncoding, it looks like dask is ignoring the client and tries to load the dataset in the local…
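
For reference, a small sketch of the usual dask_ml pattern (Categorizer to make the object columns categorical, then an encoder), shown on a toy frame rather than the 200GB dataset:

    import pandas as pd
    import dask.dataframe as dd
    from dask_ml.preprocessing import Categorizer, DummyEncoder

    pdf = pd.DataFrame({"color": ["red", "blue", "red", "green"], "value": [1, 2, 3, 4]})
    df = dd.from_pandas(pdf, npartitions=2)

    # Categorizer turns object columns into known categoricals;
    # DummyEncoder then one-hot encodes those categorical columns
    encoded = DummyEncoder().fit_transform(Categorizer().fit_transform(df))
    print(encoded.compute())
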
0
votes
1 answer

Datetime index-based slicing with Dask

I have two dataframes: links has two datetime columns called onset and offset, and each row is an event. The other dataframe, called sensors, is indexed with a datetime index of freq 1m and has ~600 columns, one per sensor-id. Essentially, for…
estraven
  • 1
  • 1
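
For context, a small sketch of datetime .loc slicing on a dask dataframe; the frame below is a synthetic stand-in for sensors and the onset/offset values are made up:

    import pandas as pd
    import dask.dataframe as dd

    idx = pd.date_range("2021-01-01", periods=1440, freq="1min")
    sensors = dd.from_pandas(pd.DataFrame({"sensor_0": range(1440)}, index=idx),
                             npartitions=4)

    # with a sorted datetime index and known divisions, .loc slices cheaply
    window = sensors.loc["2021-01-01 06:00":"2021-01-01 07:00"]
    print(len(window.compute()))   # rows between onset and offset, inclusive
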
0
votes
0 answers

Dask cluster not processing any data and just sitting idle after a while, though it was working perfectly fine a couple of weeks before

So I'm trying to parallelize the process using the dask cluster. Here's my attempt. Getting the cluster ready: gateway = Gateway( address="http://traefik-pangeo-dask-gateway/services/dask-gateway", …
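
For reference, a bare-bones sketch of connecting through dask-gateway; the address is the one from the snippet, while auth and cluster options depend on the deployment:

    from dask_gateway import Gateway

    gateway = Gateway(address="http://traefik-pangeo-dask-gateway/services/dask-gateway")
    cluster = gateway.new_cluster()
    cluster.scale(4)                  # ask for four workers
    client = cluster.get_client()     # dask work now runs on the gateway cluster
    print(client.dashboard_link)
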
0
votes
1 answer

BUG: Dask K-means exception: too many indices for array

I am using K-means clustering on a dataset with shape (563, 207383) via Dask K-means (CPU based), and am getting the error "too many indices for array". But when I use the RapidsAI dask_k-means (GPU based) it…
Vivek kala
  • 23
  • 3
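
As a point of reference, a minimal CPU-side sketch with dask_ml's KMeans on a small synthetic array; the real data is (563, 207383), but the sizes here are shrunk so the example runs quickly:

    import dask.array as da
    from dask_ml.cluster import KMeans

    X = da.random.random((563, 2000), chunks=(100, 2000))   # synthetic stand-in

    km = KMeans(n_clusters=8)
    km.fit(X)
    print(km.cluster_centers_.shape)   # (8, 2000)
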
0
votes
1 answer

Nested dask delayed or futures

Looking for best practice for nested parallel jobs. I couldn't nest dask delayed or futures, so I mixed both to get it to work. Is this not recommended? Is there a better way to do this? Example: import dask from dask.distributed import Client import…
J.Sung
  • 27
  • 5
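
One documented way to launch tasks from inside tasks on a distributed cluster is worker_client; a minimal sketch with illustrative function names:

    from dask.distributed import Client, worker_client

    def inner(x):
        return x * 2

    def outer(xs):
        # worker_client lets a running task submit further tasks without deadlocking
        with worker_client() as client:
            futures = client.map(inner, xs)
            return sum(client.gather(futures))

    if __name__ == "__main__":
        client = Client(n_workers=2)
        print(client.submit(outer, [1, 2, 3]).result())   # 12
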
0
votes
1 answer

Creating dask dataframe from delayed dask arrays

I've got a list of delayed dask arrays stored in dask_arr_ls that I want to turn into a dask dataframe. Here's a skeleton of my pipeline: def simulate_device_data(num_id): # create data for unknown number of timestamps data_ls =…
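
One common route, sketched below, is to have each delayed call return a pandas DataFrame and hand the list to dd.from_delayed with an explicit meta; simulate_device_data here is a made-up stand-in matching the snippet. For genuinely array-shaped results, da.from_delayed followed by dd.from_dask_array is the analogous path.

    import numpy as np
    import pandas as pd
    import dask
    import dask.dataframe as dd

    @dask.delayed
    def simulate_device_data(num_id):
        # an unknown number of timestamped readings per device
        n = np.random.randint(50, 150)
        return pd.DataFrame({"device": num_id, "reading": np.random.random(n)})

    delayed_ls = [simulate_device_data(i) for i in range(5)]

    # meta declares column names/dtypes so nothing has to be computed up front
    meta = pd.DataFrame({"device": pd.Series(dtype="int64"),
                         "reading": pd.Series(dtype="float64")})
    df = dd.from_delayed(delayed_ls, meta=meta)
    print(df.head())
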
0
votes
1 answer

Using DASK to read files and write to NEO4J in PYTHON

I am having trouble parallelizing code that reads some files and writes to neo4j. I am using dask to parallelize the process_language_files function (3rd cell from the bottom). I explain the code below, listing out the functions (first 3…
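
For orientation, a stripped-down sketch of a one-file-per-task layout; the connection details, Cypher query, and file names are all placeholders, and the driver is opened inside the task so it never has to be serialized:

    import dask
    from neo4j import GraphDatabase

    @dask.delayed
    def process_language_file(path):
        driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
        with driver, open(path) as fh, driver.session() as session:
            for line in fh:
                session.run("MERGE (w:Word {text: $text})", text=line.strip())
        return path

    files = ["lang_en.txt", "lang_de.txt"]   # placeholder file list
    dask.compute(*[process_language_file(f) for f in files])
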
0
votes
1 answer

Display progress on dask.compute(*something) call

I have the following structure in my code using Dask: @dask.delayed def calculate(data): services = data.service_id prices = data.price return [services, prices] output = [] for qid in notebook.tqdm(ids): r =…
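
For the local (threads/processes) schedulers, the standard tool is dask.diagnostics.ProgressBar around the blocking compute call; on a distributed Client, dask.distributed.progress plays the same role. A minimal sketch with a made-up calculate:

    import dask
    from dask.diagnostics import ProgressBar

    @dask.delayed
    def calculate(x):
        return x * 2

    output = [calculate(i) for i in range(100)]

    with ProgressBar():                  # prints a text progress bar while computing
        results = dask.compute(*output)
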
0
votes
1 answer

How to add/append a row to a particular partition in the dask dataframe?

I want to append a row to a particular partition of a dask dataframe. I have tried many methods, but none of them worked. Can anyone help me with this? Thanks in advance. I tried: first_partition = df.partitions[0] new_dd =…
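
Dask dataframes are not built for in-place row appends; one workaround sketch is to materialize the target partition, append in pandas, and concatenate it back with the untouched partitions:

    import pandas as pd
    import dask.dataframe as dd

    df = dd.from_pandas(pd.DataFrame({"x": range(10)}), npartitions=2)

    new_row = pd.DataFrame({"x": [99]})
    first = pd.concat([df.partitions[0].compute(), new_row], ignore_index=True)

    # rebuild: modified first partition + the remaining original partitions
    df2 = dd.concat([dd.from_pandas(first, npartitions=1), df.partitions[1:]])
    print(df2.compute())
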
0
votes
1 answer

Reading large volume data from Teradata using Dask cluster/Teradatasql and sqlalchemy

I need to read a large volume of data (approx. 800M records) from Teradata. My code works fine for a million records, but for larger sets it takes a long time to build the metadata. Could someone please suggest how to make it faster? Below is the code snippet which I…
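
One knob that often matters here, sketched below, is handing read_sql_table an explicit meta and partitioning so dask does not probe the table to infer the schema itself; the URI, table, and columns are placeholders:

    import pandas as pd
    import dask.dataframe as dd

    uri = "teradatasql://user:password@tdhost"        # placeholder Teradata URI

    # an empty frame describing the expected columns/dtypes, indexed like the result
    meta = pd.DataFrame({"id": pd.Series(dtype="int64"),
                         "amount": pd.Series(dtype="float64")}).set_index("id")

    # explicit meta + npartitions avoids the slow metadata-inference round trips
    df = dd.read_sql_table("transactions", uri, index_col="id",
                           npartitions=100, meta=meta)
    print(df.npartitions)
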
0
votes
1 answer

dask broadcast variable fails with KeyError when calculating a subset of a pandas dataframe

I have a pandas data frame and want to apply a costly operation to each group. Therefore, I want to parallelize this task using dask. The initial data frame should be broadcast. But the computation fails with:
Georg Heiler
  • 16,916
  • 36
  • 162
  • 292
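
A minimal sketch of the scatter-with-broadcast pattern; the group keys and the costly operation below are stand-ins:

    import pandas as pd
    from dask.distributed import Client

    def costly_group_op(group_key, frame):
        # operate on the rows of the broadcasted frame belonging to this group
        return frame.loc[frame["key"] == group_key, "value"].sum()

    if __name__ == "__main__":
        client = Client(n_workers=2)
        df = pd.DataFrame({"key": ["a", "a", "b"], "value": [1, 2, 3]})

        # scatter once with broadcast=True so every worker holds a copy,
        # then pass the returned future instead of the frame itself
        df_future = client.scatter(df, broadcast=True)
        futures = [client.submit(costly_group_op, k, df_future) for k in ["a", "b"]]
        print(client.gather(futures))   # [3, 3]
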
0
votes
1 answer

Handling dask delayed failures

How can I port the following function to dask in order to parallelize it? from time import sleep from dask.distributed import Client from dask import delayed client = Client(n_workers=4) from tqdm import tqdm tqdm.pandas() # linear things =…
Georg Heiler
  • 16,916
  • 36
  • 162
  • 292
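
A common sketch is to wrap the fragile call in a small function that catches the exception and returns a sentinel, so one failing input does not abort the whole batch:

    import dask

    def fragile(x):
        if x == 3:
            raise ValueError("boom")
        return x * 2

    @dask.delayed
    def safe(x):
        # per-item failures become sentinel tuples instead of raised exceptions
        try:
            return ("ok", fragile(x))
        except ValueError as exc:
            return ("failed", x, str(exc))

    print(dask.compute(*[safe(i) for i in range(6)]))
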
0
votes
1 answer

How many dask jobs per worker

If I spin up a dask cluster with N workers and then submit more than N jobs using cluster.compute, does dask try to run all the jobs simultaneously (by scheduling more than 1 job on each worker), or are the jobs queued and run sequentially? My…
firdaus
  • 541
  • 1
  • 6
  • 13
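
In short, a worker runs as many tasks concurrently as it has threads; anything beyond that waits in the scheduler's queue. A tiny sketch:

    import time
    from dask.distributed import Client

    def job(i):
        time.sleep(1)
        return i

    if __name__ == "__main__":
        # 4 workers x 1 thread each => at most 4 tasks in flight at once;
        # the other 16 submissions queue on the scheduler until a slot frees up
        client = Client(n_workers=4, threads_per_worker=1)
        futures = client.map(job, range(20))
        print(sum(client.gather(futures)))   # 190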