I am trying to use modin instead of pandas to "parallelize by changing a single line of code". I'm using IDLE, and when I run this code:

import os
os.environ["MODIN_ENGINE"] = "ray"    
import ray
ray.init()
import modin.pandas as pd
pd.read_csv("some_path")

Some command prompt windows open and close (their paths refer to ray), then the line ================================ RESTART: Shell ================================ is shown, with no error code, so I can't tell what went wrong. After that, whatever pandas command I try to run in the IDLE window, I get the error "NameError: name 'pd' is not defined".

The problem seems to come from IDLE, because when I ran the code directly from the command prompt, it worked as intended.

So I tried these solutions, all of which failed:

- Rebooting the computer

- Checking whether there were several Python installations

- Uninstalling, re-downloading, and reinstalling all modules

- Completely uninstalling and reinstalling Python (3.9)

I found a log saying the error comes from ray, and that the root cause is logged in dashboard_agent.log.

The referenced log is not saved on every run, but I found two copies of it, and they warn about a missing module.

I installed the missing module and re-ran the script multiple times, but it still does not work. The logs still refer to dashboard_agent.log, which is no longer generated when I run the code, at least not in 20 attempts.

2 Answers

It appears IDLE gives the RESTART message when a subprocess fails; see https://stackoverflow.com/a/29216224/19027728. To clarify, does this happen when you are running the IDLE shell, but not from the command prompt? In the command prompt run, are you running a script (i.e. python script.py) or are you running python interactively? Have you tried running this with another shell, like IPython?

From your debugging steps, it appears it might be a Ray issue. To confirm, can you try running with Dask instead? If you installed with modin[all] you should have that backend. If not, pip install modin[dask] should work.

If using Ray is a necessity, could you perhaps try sharing some of those Ray debug logs? Also, make sure to call ray.shutdown() when appropriate to avoid spawning redundant Ray instances, which might cause issues.
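
As a rough sketch of the Dask check suggested above: the snippet below sets the MODIN_ENGINE environment variable (the same variable used in the question) before modin.pandas is imported, and shuts Ray down again when Ray was the engine. The helper names select_modin_engine and csv_shape are hypothetical, and "some_path" is the placeholder from the question; this assumes modin was installed with the chosen backend.

```python
import os

# Engines discussed in this thread; modin may support others.
SUPPORTED_ENGINES = {"ray", "dask"}


def select_modin_engine(name):
    """Set MODIN_ENGINE. Call this before modin.pandas is first imported."""
    engine = name.lower()
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported engine: {name!r}")
    os.environ["MODIN_ENGINE"] = engine
    return engine


def csv_shape(path, engine="dask"):
    """Read a CSV through modin with the given engine and return its shape."""
    engine = select_modin_engine(engine)
    # Imported lazily so the engine choice above is already in place.
    import modin.pandas as pd
    try:
        return pd.read_csv(path).shape
    finally:
        if engine == "ray":
            import ray
            # Avoid leaving a Ray instance behind between runs.
            if ray.is_initialized():
                ray.shutdown()
```

Calling csv_shape("some_path", engine="dask") and then csv_shape("some_path", engine="ray") in the same environment should show whether the crash is specific to Ray.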

Jeffrey Li
  • I open a command prompt, type "python", then copy and paste the code I want to run, and it works. I tried to use Dask and got issues too. I also tried calling ray.shutdown() before ray.init(), but it didn't change anything. – All the things she said kekw Jun 09 '22 at 18:41

Jeffrey Li is correct about the restart message indicating that IDLE's execution subprocess crashed. The difference from command-line Python may be that the execution environment is slightly different when executing via IDLE. (This will be true of any GUI alternative.) My additional suggestions:

  1. Read the Running user code section of the IDLE doc. It is available on IDLE's help menu. One can use sys to investigate the differences mentioned.

  2. Run IDLE from a command line, if not doing so already, with python -m idlelib, where python is adjusted as appropriate for your OS and Python version. This makes the execution environment more similar to the standard environment. So a) you may get an error message on the terminal where you start IDLE and b) your program might behave better.

  3. For the present, develop your pandas application by running pandas directly, without modin. All I know of the latter is that it appears to be an optional optimizer. I know that at least some parts of pandas run fine with IDLE when run directly.
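
A minimal sketch of the sys investigation mentioned in suggestion 1, to be run once in IDLE and once in a console so the outputs can be compared. The function name describe_environment and the choice of attributes are assumptions made for illustration; the IDLE doc describes the actual differences.

```python
import sys


def describe_environment():
    """Collect sys attributes that may differ between IDLE and a console."""
    def type_name(obj):
        return f"{type(obj).__module__}.{type(obj).__qualname__}"

    return {
        "executable": sys.executable,
        "argv": list(sys.argv),
        # IDLE replaces sys.stdout with its own object for the Shell window.
        "stdout_type": type_name(sys.stdout),
        # The original streams may be None when no console is attached.
        "original_stdout_is_none": sys.__stdout__ is None,
        "original_stderr_is_none": sys.__stderr__ is None,
        # Assumption: idlelib shows up in sys.modules when run under IDLE.
        "idlelib_loaded": "idlelib" in sys.modules,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Any entry that differs between the two runs is a candidate for what modin or Ray trips over.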

Terry Jan Reedy
  • The result of 2. is b): the code runs as intended that way. So the issue is caused by the way I run the script in IDLE: file.py > right click > Edit with IDLE > Run Module (F5). – All the things she said kekw Jun 09 '22 at 19:03
  • It then appears that modin or something it imports depends on some sys attribute having a subset of legal values. It could check and exit gracefully with an exception. The command line for IDLE can include a file to edit: `python -m idlelib somepath/file.py`. `somepath/` is not needed if in the directory with file.py. (If this answer solves your problem, could you accept it?) – Terry Jan Reedy Jun 10 '22 at 21:19
  • I tried it and it works. This solution is as convenient as copy-pasting the code into the command prompt. Maybe there is a way to change or replace the "Edit with IDLE" option so it runs python -m idlelib somepath/file.py? – All the things she said kekw Jun 10 '22 at 21:57
  • No. One has to start IDLE from a running console process to have the IDLE processes connect to the console. – Terry Jan Reedy Jun 11 '22 at 23:05
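
For convenience, the `python -m idlelib somepath/file.py` command from the comments can be wrapped in a small helper. This is only a sketch: the function names idle_command and open_in_idle are hypothetical, and it has to be called from a Python session that was itself started in a console, since IDLE only connects to a console it was launched from.

```python
import subprocess
import sys
from pathlib import Path


def idle_command(script_path):
    """Build the command that opens script_path in IDLE with this interpreter."""
    return [sys.executable, "-m", "idlelib", str(Path(script_path))]


def open_in_idle(script_path):
    """Start IDLE as a child of the current process and wait for it to exit."""
    return subprocess.run(idle_command(script_path)).returncode
```

Using sys.executable keeps IDLE on the same Python installation as the session that launches it, which matters when several installations are present.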