I am using Coiled to spin up a cluster and Dask to do some manipulation on a CSV read from an S3 bucket. However, at some point my workers are getting killed. When I inspected the scheduler logs, the following tasks are the ones killing them:
```
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 65, 0) marked as failed because 3 workers died while trying to run it
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 70, 0) marked as failed because 3 workers died while trying to run it
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 71, 0) marked as failed because 3 workers died while trying to run it
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 86, 0) marked as failed because 3 workers died while trying to run it
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 1, 0) marked as failed because 3 workers died while trying to run it
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 8, 0) marked as failed because 3 workers died while trying to run it
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 45, 0) marked as failed because 3 workers died while trying to run it
distributed.scheduler - INFO - Task ('read-csv-values-values-00474dd1e867972e5b6636ffb4e71705', 39, 0) marked as failed because 3 workers died while trying to run it
```
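For reference, my setup looks roughly like this (the worker count, bucket name and file path are placeholders for my real ones):

```python
import coiled
import dask.dataframe as dd
from dask.distributed import Client

# Spin up a Coiled cluster and attach a Dask client to it
cluster = coiled.Cluster(n_workers=10)  # placeholder worker count
client = Client(cluster)

# Lazily read the CSV from S3 into a Dask DataFrame
df = dd.read_csv("s3://my-bucket/my-file.csv")  # placeholder bucket/key
```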
I then moved the CSV out of the S3 bucket into my local repo and ran the same code, but the read_csv tasks still failed.
Another point: the same read_csv worked fine for my earlier data manipulation, but once I add some dummy encoding, date manipulation and a .compute(), the workers get killed. The failing step is roughly along the lines of the sketch below.
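The column names here are placeholders for my actual ones:

```python
# Date manipulation on a parsed datetime column
df["date"] = dd.to_datetime(df["date"])
df["year"] = df["date"].dt.year

# Dummy-encode a categorical column (categorize first so get_dummies knows the categories)
df = df.categorize(columns=["some_category"])
df = dd.get_dummies(df, columns=["some_category"])

# Materialise the result on the client; the workers die while this runs
result = df.compute()
```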
Any idea what might be going on?