Questions tagged [modin]

Modin is a project that speeds up pandas workflows by changing only a single import statement. See the documentation at https://modin.readthedocs.io/.

80 questions
0
votes
1 answer

Can I set the isolation level of modin's parallel read_sql function?

I have some Python code with which I am trying to read uncommitted data from my database in parallel using SQLAlchemy and modin. I have tried calling the function as: df = pd.read_sql("select * from my_table", uri_string, params={'isolation_level':…
0
votes
0 answers

Pandas version is different in Anaconda Navigator and Anaconda Prompt

I tried importing modin in my Jupyter notebook. When I run the cell import modin.pandas as pd, I get the following warning: UserWarning: The pandas version installed 1.2.3 does not match the supported pandas version in Modin 1.2.4. This may cause…
cva
0
votes
2 answers

Modin conflicts with dask

I'm trying modin, but keep getting an error: import modin.pandas as md import pandas as pd PATH = 'file.csv' %%time df = pd.read_csv(PATH) %%time mdf = md.read_csv(PATH) error: UserWarning: Dask execution environment not yet initialized.…
Oleg Peregudov
0
votes
1 answer

Intel Modin Installation

How do I install the Intel Distribution of Modin? I want to install Intel AI Kit Modin using an existing Conda-based Python environment. What steps are required to install Modin after activating the conda environment?
AlekhyaV - Intel
0
votes
1 answer

Modin df iterrows is painfully slow. Any alternative to speed it up?

I have a modin dataframe with ~120k rows. I want to coalesce some of its columns. Modin df iterrows is taking a lot of time, so I tried numpy.where. numpy.where on the equivalent pandas df does it in 5-10 minutes, but the same thing on the modin df takes…
Vikas Garud
0
votes
1 answer

ImportError: cannot import name 'Flags' from 'pandas'

I ran into the error below when trying to import pandas from modin on macOS: import modin.pandas as pd. What is a possible fix for this? Error traceback: ImportError Traceback (most recent call…
chuky pedro
0
votes
1 answer

How to solve type object 'Series' has no attribute '_get_dtypes' error using modin.pandas?

I am using modin.pandas to remove the duplicates from a dataframe. import modin.pandas as pd import json, ast df = pd.DataFrame(columns=['contact_id', 'test_id']) df['test_id'] = df['test_id'].astype(str) # Converting test_id column data type to…
Sangram Badi
0
votes
2 answers

What can modin be used for?

I have been looking at parallelization options and found ray and modin. After some tests I am slightly lost as to what benefits from using modin. Two examples: df = pd.read_csv() for 180 MB file pandas 5.2s vs. modin.pandas 2.7s but df.groupby() pandas…
0
votes
1 answer

Pandas string subscripting does not work in modin (and related questions about converting pandas code to modin)

I recently learned about modin, and am trying to convert some of my code from pandas to modin. My understanding is that modin has some operations that run faster and others that it has not optimized, so it defaults to pandas for those. Thus anything…
amquack
0
votes
1 answer

Unhashable type: Series when using modin with pandas?

I am in Anaconda on Windows 10; I installed via: conda install -c anaconda dask conda install -c conda-forge modin conda update conda conda update anaconda conda update dask conda install -c conda-forge pandas=1.0.5 # this will also download modin…
sdbbs
0
votes
1 answer

Read large xlsx with 2 sheets or csv into dataframe

I have an xlsx file with 11 columns and 15M rows, 198 MB in size. It's taking forever to read and work with in pandas. After reading Stack Overflow answers, I switched to dask and modin. However, I'm receiving the following error when using…
Vishal Kamlapure
0
votes
2 answers

Can't fit dataframe with fbprophet using dask to read the csv into a dataframe

References: https://examples.dask.org/applications/forecasting-with-prophet.html?highlight=prophet https://facebook.github.io/prophet/ A few things to note: I've got a total of 48 GB of RAM. Here are the versions of the libraries I'm using: Python…
Nubonix
0
votes
1 answer

How to optimize this pandas iterable

I have the following method in which I am eliminating overlapping intervals in a dataframe based on a set of hierarchical rules: def disambiguate(arg): arg['length'] = (arg.end - arg.begin).abs() df = arg[['begin', 'end', 'note_id',…
horcle_buzz
0
votes
1 answer

ModuleNotFoundError for 'modin' even though it is installed by poetry

On the line import modin.pandas as modin_pd, I get ModuleNotFoundError: No module named 'modin'. I am using poetry & JupyterLab. If I type !poetry add modin in the cell, I get a ValueError saying Package modin is already present. So it cannot install modin…
Valeria
0
votes
0 answers

How do I check if pandas import is modin or original

While doing some OLS regressions, I discovered that statsmodels.api.add_constant() does the following: if _is_using_pandas(data, None) or _is_recarray(data): from statsmodels.tsa.tsatools import add_trend return add_trend(data, trend='c',…
s5s