Questions tagged [modin]

Modin is a project to speed up pandas workflows by changing only a single import statement. Peruse the documentation at https://modin.readthedocs.io/.

80 questions
1 vote, 1 answer

Top level imports supersede lower level imports?

In a Jupyter notebook, I have: import modin.pandas as pd; import utils. utils.py has: import pandas as pd. Does the pd in utils.py refer to pandas, or to modin.pandas? If the former, is there a way for me to make utils.py use modin.pandas from the Jupyter…
piedpiper • 1,222 • 3 • 14 • 27
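
A minimal sketch of the import question above, based only on how Python module namespaces work and not on anything Modin-specific: each module looks up `pd` in its own globals, so an alias bound in the notebook does not change what `pd` means inside utils.py. To keep the sketch runnable without pandas or Modin installed, the stdlib modules `csv` and `json` stand in for pandas and modin.pandas, and the module name `utils_demo` and the helper `patch_utils` are hypothetical.

```python
import sys
import types

# Build a throwaway module that stands in for utils.py.
# Inside it, "pd" is bound to csv (stand-in for plain pandas).
utils_demo = types.ModuleType("utils_demo")
exec(
    "import csv as pd\n"
    "\n"
    "def which_pd():\n"
    "    return pd.__name__\n",
    utils_demo.__dict__,
)
sys.modules["utils_demo"] = utils_demo

# In the "notebook" namespace, bind the same alias to a different module
# (stand-in for modin.pandas).
import json as pd


def notebook_pd():
    """Name of the module that 'pd' refers to in this namespace."""
    return pd.__name__


def patch_utils(module, replacement):
    """Rebind the 'pd' attribute of another module (monkeypatching)."""
    module.pd = replacement
```

Before patching, `utils_demo.which_pd()` reports the module's own binding regardless of what the notebook imported. Calling `patch_utils(utils_demo, pd)` rebinds it from the outside; whether that is a good idea for a real utils.py is a design choice, and changing the import inside utils.py itself is the more explicit route.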
1 vote, 0 answers

modin incorrect syntax error on read_sql with oracle

Using modin's implementation of read_sql, I am getting a syntax error. It seems to be due to the aliasing of the count query generated in read_sql. Am I doing something wrong or is this due to a lack of compatibility with oracle? Also, note that…
kjmena • 11 • 3
1 vote, 2 answers

Modin with ray for pandas working in command prompt but not in IDLE, no error code

I am trying to use modin instead of pandas to "parallelize by changing a single line of code". I'm using IDLE, and when I run this code: import os; os.environ["MODIN_ENGINE"] = "ray"; import ray; ray.init(); import modin.pandas as…
1 vote, 1 answer

How to replace type: pandas.core.frame.DataFrame with type: modin.pandas.dataframe.DataFrame

I am trying to replace pandas with modin.pandas in this code: if not isinstance(X, pd.DataFrame): raise TypeError("X is not a pandas dataframe. The dataset should be a pandas dataframe.") but the error is: DataFrame Expected type
Peter Pirog • 142 • 1 • 8
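
One possible way to handle the type check in the question above is to accept either DataFrame class. This is a sketch under assumptions, not the asker's or Modin's recommended approach: it assumes `modin.pandas.DataFrame` is a separate class that does not pass an `isinstance` check against `pandas.DataFrame`, and the helper names `accepted_dataframe_types` and `check_is_dataframe` are hypothetical. Both imports are optional so the code runs even when only one (or neither) library is installed.

```python
def accepted_dataframe_types():
    """Collect the DataFrame classes available in this environment."""
    types_found = []
    try:
        import pandas
        types_found.append(pandas.DataFrame)
    except ImportError:
        pass  # pandas is not installed
    try:
        import modin.pandas as mpd
        types_found.append(mpd.DataFrame)
    except ImportError:
        pass  # modin is not installed
    return tuple(types_found)


def check_is_dataframe(X):
    """Raise TypeError unless X is a pandas or Modin DataFrame."""
    # isinstance accepts a tuple of classes; an empty tuple matches nothing.
    if not isinstance(X, accepted_dataframe_types()):
        raise TypeError(
            "X is not a dataframe. The dataset should be a pandas or Modin dataframe."
        )
    return X
```

For example, `check_is_dataframe([1, 2, 3])` raises `TypeError`, while a DataFrame from either library is returned unchanged.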
1 vote, 1 answer

Modin AttributeError when importing from sparse matrix

I am trying to use the Modin package to import a sparse matrix created with scipy (specifically, a scipy.sparse.csr_matrix). When invoking the method: from modin import pandas as pd; pd.DataFrame.sparse.from_spmatrix(mat), I am getting the following…
1 vote, 1 answer

Pandas result of findall to single row

Hello, I have a CSV file and I am using pandas. My issue arises when I use pandas.Series.str.findall. After calling findall, I would like to save the result (which is an array) to a row in the CSV. This is my code: data = pd.read_csv("input.csv") …
EagleCode • 125 • 1 • 9
1 vote, 1 answer

Using Prophet or Auto ARIMA with Ray

There is something about Ray for which I could not find a clear answer. Ray is a distributed framework for data processing and training. In order to make it work in a distributed fashion, Modin or some other distributed data analysis tool supported by Ray…
M.Erkin • 120 • 1 • 6
1 vote, 1 answer

Transform python response to Json response

I've worked on Python code that automates reading data frames from multiple file extensions and prints the DF's first 100 lines as well as the types of its columns, with the possibility of adding more things within the same simple function. I'm currently…
Hamza • 41 • 5
1 vote, 1 answer

Why does it take longer than using Pandas when I used modin.pandas [ray]

I'm just a Python newbie who's had fun dealing with data with Python. When I was able to use Python's representative data tool, pandas, it seemed that it would be able to work on Excel very quickly. However, I was somewhat disappointed to see it…
ThanksAudit • 27 • 1 • 4
1 vote, 1 answer

remove rows from one dataframe based on conditions from another dataframe in pandas Python

I have two pandas data frames containing millions of rows in Python. I want to remove rows from the first data frame that contain words from the second data frame, based on three conditions: if the word appears at the beginning of the sentence in a row; if…
Tanmay Jain • 123 • 1 • 2 • 12
1 vote, 1 answer

ModuleNotFoundError: No module named 'modin'

I have created a virtual environment with the following commands in the Windows terminal: conda create --prefix ./modinenv python=3.6 numpy; conda activate e:\modin\modinenv; pip install modin[dask]; jupyter notebook. In a new Python file, when I executed…
gopinath • 13 • 4
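
A common first check for an error like the one above is whether the interpreter running the code is the one from the environment where the package was installed. The following diagnostic is a sketch under that assumption, not a confirmed fix for this question; the function name `describe_module` is hypothetical, and it uses only the standard library.

```python
import importlib.util
import sys


def describe_module(name):
    """Report which interpreter is running and whether it can see `name`.

    `name` should be a top-level module name such as "modin".
    """
    spec = importlib.util.find_spec(name)  # None when the module is not importable
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_module("modin"))
```

If `interpreter` points outside the activated environment, the notebook kernel or editor is probably using a different Python than the one `pip install` targeted.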
1 vote, 0 answers

Modin vs threading for pandas DataFrame

I have a DataFrame with 343,500 records and a predefined get_zipcode function. In order to speed up the apply, I split the data in four and created the following threaded process using the threading module: df['subsections'] = np.resize([1,2,3,4],…
Yehuda • 1,787 • 2 • 15 • 49
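
The split-and-thread approach described in the question above can be sketched with the standard library alone. This is an illustration under assumptions rather than the asker's code: `get_zipcode` here is a hypothetical stand-in for the predefined function, plain lists replace the DataFrame, and `split_into_chunks` and `threaded_apply` are made-up helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def get_zipcode(record):
    """Hypothetical stand-in for the question's predefined function."""
    return str(record).zfill(5)


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def threaded_apply(func, items, n_workers=4):
    """Apply func to every item, one chunk per worker thread."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map preserves chunk order, so results line up with the input
        results = list(pool.map(lambda chunk: [func(x) for x in chunk], chunks))
    return [value for chunk_result in results for value in chunk_result]
```

In standard CPython, threads mainly help when the applied function waits on I/O (for example a network lookup); for CPU-bound work the global interpreter lock generally prevents a speedup, which is one reason process-based engines are used instead.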
1 vote, 1 answer

str[0:z] works with pandas but not with modin: TypeError: 'StringMethods' object is not subscriptable

I'm running Spyder on Python 3.7 and am new to modin. I want to retrieve the first characters in a string and save them to a new column. When I run the usual code with pandas, it works: import pandas as pd; data = pd.read_csv('Path/data.csv', dtype=str,…
pandini • 69 • 7
1 vote, 0 answers

Unable to connect to Redis when running modin.pandas from PyCharm

After installing modin on my Windows machine (pip install modin[ray]), I can run simple examples on a jupyter notebook, but it fails when running from PyCharm. I get an exception: Unable to connect to Redis. Any suggestion? ### Read in the data with…
Pebeto • 140 • 5
1 vote, 1 answer

Does Modin speed up the pandas apply function?

I have tried to find an answer in many places, but have not found a direct one yet. Does modin speed up apply on DataFrames? Is it intelligent enough to parallelize the apply function across the DataFrame rather than going row by row? Or should we go for…
Hari Prasad • 1,751 • 2 • 15 • 20