
Is there a way to skip plotting NaN and Inf values in python?

Say I have something like:

m1  m2  m3
4   5   2
3   2   3
4   3   4
2   5   0
4   3   8
3   4   0
2   3   4

and I want to plot m1 vs m2/m3. In some rows I am dividing by zero, so when I go to plot, it throws `ValueError: Axis limits cannot be NaN or Inf`.

What is the solution? Thank you.

(I am using pandas DataFrames: I compute m4 = df['m2'] / df['m3'] and then plot m1 vs m4.)

Please note: there are no Inf or NaN values in the DataFrame itself; they only appear after the manipulation.

destructioneer
sci-guy
  • Possible duplicate of [dropping infinite values from dataframes in pandas?](https://stackoverflow.com/questions/17477979/dropping-infinite-values-from-dataframes-in-pandas) – Chris Sep 29 '19 at 03:55
  • It's not, as that removes it from the data frame itself. This is plotting after manipulation – sci-guy Sep 29 '19 at 04:03

2 Answers

With the tools pandas provides, the easiest solution is to:

  • replace Inf values (both positive and negative) with NaN,
  • drop NaN values.

Then your Series will contain neither Infs nor NaNs.

So, when generating m4, extend your statement to:

m4 = (df['m2'] / df['m3']).replace([np.inf, -np.inf], np.nan).dropna()

getting (in your case):

0    2.500000
1    0.666667
2    0.750000
4    0.375000
6    0.750000
dtype: float64

(no values for index labels 3 and 5), which can be plotted as you wish.
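One detail to watch: after dropna(), m4 is shorter than df['m1'], so the x values have to be selected with the same index before plotting. The following is a minimal sketch of that step, assuming matplotlib is installed and using the sample data from the question; the variable name m1 and the axis labels are only illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data from the question.
df = pd.DataFrame({
    "m1": [4, 3, 4, 2, 4, 3, 2],
    "m2": [5, 2, 3, 5, 3, 4, 3],
    "m3": [2, 3, 4, 0, 8, 0, 4],
})

# Division by zero yields +/-Inf; turn those into NaN and drop them.
m4 = (df["m2"] / df["m3"]).replace([np.inf, -np.inf], np.nan).dropna()

# Keep only the m1 values whose index labels survived in m4,
# so that x and y have the same length.
m1 = df.loc[m4.index, "m1"]

fig, ax = plt.subplots()
ax.scatter(m1, m4)
ax.set_xlabel("m1")
ax.set_ylabel("m2 / m3")
# Call plt.show() or fig.savefig(...) to display or save the figure.
```

Selecting by m4.index keeps df itself untouched, which matches the requirement that the original DataFrame should not be modified.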

Edit

I noticed that (at least in pandas v. 0.25) a DataFrame scatter plot can be generated even with Inf or NaN values in the y column.

I did the following experiment:

  1. Created m4 without removing Inf / NaN values, but setting its name (it will be needed in a moment for the join):

    m4 = df.m2.divide(df.m3).rename('m4')
    
  2. Replaced the second occurrence of Inf with NaN:

    m4.iat[5] = np.nan
    

    so now it contains both Inf and NaN.

  3. Generated the plot:

    df.join(m4).plot.scatter(x='m1', y='m4');
    

As a result, I got the plot without any errors.

Apparently, any Inf / NaN values are silently dropped in this case, and only then is the plot generated.

Valdi_Bo

You can keep only the rows in which every element is finite.

import numpy as np
import pandas as pd
m1 = [4 ,3 ,4 ,2 ,4 ,3 ,2]
m2 = [5 ,2 ,3 ,5 ,3 ,4 ,3]
m3 = [2, 3, 4, 0, 8, 0, 4]
df = pd.DataFrame({'m1':m1,'m2':m2,'m3':m3})

Say m4 = m2/m3:

df['m4']= df.m2/df.m3

...which may contain infinite elements. To solve this problem, apply NumPy's isfinite to your data frame df and then call .all(1), which returns True for each row in which all cells are finite:

df[np.isfinite(df).all(1)]

For visualization purposes, please see the attached screenshots.


After this, you can plot m1 vs. m4 without worrying about the infinite values.
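np.isfinite is only defined for numeric data, so if the frame also holds non-numeric columns it may be safer to test just the columns being plotted. Below is a minimal sketch of that variant; the label column and the names cols, mask and clean are hypothetical and not part of the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "m1": [4, 3, 4, 2, 4, 3, 2],
    "m2": [5, 2, 3, 5, 3, 4, 3],
    "m3": [2, 3, 4, 0, 8, 0, 4],
    "label": list("abcdefg"),  # hypothetical non-numeric column
})
df["m4"] = df["m2"] / df["m3"]  # rows 3 and 5 become Inf

# Check finiteness only on the columns that will be plotted.
cols = ["m1", "m4"]
mask = np.isfinite(df[cols]).all(axis=1)

# Filtered view for plotting; df itself keeps all seven rows.
clean = df.loc[mask]
print(clean)
```

With matplotlib installed, clean.plot.scatter(x='m1', y='m4') should then draw only the finite points.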

Hope this helps!

JP Maulion