I'm working with Dask on Kubernetes using the Helm Chart in the stable/dask repository. When I use the distributed Client and call client.scatter(ddf), I get the following exception:

Exception: No module named 'pandas.core.internals.managers'; 'pandas.core.internals' is not a package

Review of the installed packages shows Pandas==0.24.1 & dask-core==1.1.1 on Python 3.7.

Memory consumption on the workers suggests that nothing is being sent to them. When I add the broadcast keyword, I can observe a short-term rise in memory usage on a second worker, but then I get the error cited above.

Any suggestions for what I'm doing wrong, or is this an issue with Dask/Pandas?

Thanks.

GHayes

1 Answer

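My guess is that the versions of Pandas installed on your different machines differ. The message "'pandas.core.internals' is not a package" suggests that the machine raising the error runs a Pandas older than 0.24, where pandas.core.internals was still a single module, so it cannot deserialize a dataframe created under 0.24.1. You can check this with the following command: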

client.get_versions(check=True)
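
If the output of that check is hard to read, a minimal sketch like the one below may help narrow down which machines disagree. It assumes pandas is importable on every worker and that `client` is an already connected distributed Client; the helper names `pandas_version`, `collect_versions`, and `group_by_version` are hypothetical and not part of Dask.

```python
from collections import defaultdict


def pandas_version():
    # Runs wherever it is called (client or worker) and reports
    # the Pandas version installed in that environment.
    import pandas
    return pandas.__version__


def collect_versions(client):
    # client.run executes the function on every worker and returns
    # a dict mapping worker address -> result.
    versions = dict(client.run(pandas_version))
    # Add the local (client-side) version for comparison.
    versions["client"] = pandas_version()
    return versions


def group_by_version(versions):
    # Invert {node: version} into {version: [nodes]} so that
    # mismatches show up as more than one key.
    groups = defaultdict(list)
    for node, version in versions.items():
        groups[version].append(node)
    return dict(groups)
```

Calling group_by_version(collect_versions(client)) should return a single key when all environments match. More than one key points to the machines whose image or environment needs to be rebuilt with the same Pandas version.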
MRocklin