I'd like to replace all values in a df that are between -0.5 and 0.5 with NaNs.

For the upper bound alone, this solution works nicely:

df[df < 0.5] = np.nan

I can't, however, figure out how to add a second condition like:

df[-0.5 < df < 0.5] = np.nan

Any help would be greatly appreciated!

Thanks.

kxp
  • Have you tried: `df[(df < 0.5) & (df > -0.5)]`? – pault Feb 15 '18 at 19:36
  • That works perfectly. Thanks @pault – kxp Feb 15 '18 at 20:32
  • Possible duplicate of [Efficient way to apply multiple filters to pandas DataFrame or Series](https://stackoverflow.com/questions/13611065/efficient-way-to-apply-multiple-filters-to-pandas-dataframe-or-series) – pault Feb 15 '18 at 20:38

1 Answer

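A chained comparison such as `-0.5 < df < 0.5` does not work on a DataFrame: Python expands it to `(-0.5 < df) and (df < 0.5)`, and `and` needs a single truth value, so pandas raises a ValueError. Instead, build the two conditions separately, df < 0.5 and df > -0.5, wrap each in parentheses (because `&` binds more tightly than the comparisons), and combine them with the elementwise `&`: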

df[(df < 0.5) & (df > -0.5)] = np.nan

For instance:

import pandas as pd
import numpy as np
# Example df
df = pd.DataFrame(data={'data1': 2*np.random.randn(100),
                        'data2': 2*np.random.randn(100)})

# Show example with all values as original
>>> df.head(10)
      data1     data2
0 -0.113909  3.625936
1 -2.795349 -1.362933
2 -3.750103  2.686047
3  3.286711 -2.937002
4 -0.279161 -2.255135
5 -0.394181  3.937575
6 -1.166115  0.776880
7 -2.750386  0.681216
8  1.375598 -1.070675
9 -0.871180 -0.122937


df[(df < 0.5) & (df > -0.5)] = np.nan

# Show df with NaN when between -0.5 and 0.5
>>> df.head(10)
      data1     data2
0       NaN  3.625936
1 -2.795349 -1.362933
2 -3.750103  2.686047
3  3.286711 -2.937002
4       NaN -2.255135
5       NaN  3.937575
6 -1.166115  0.776880
7 -2.750386  0.681216
8  1.375598 -1.070675
9 -0.871180       NaN
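
Since the example above uses random data, here is a minimal reproducible sketch of the same idea. The values in the frame and the names `in_band`, `masked` and `masked_abs` are made up for illustration. It first shows the error the chained comparison raises, then two equivalent variants built on `DataFrame.mask`, which returns a new frame instead of assigning in place:

```python
import numpy as np
import pandas as pd

# Small fixed frame (hypothetical values) so the result is reproducible
df = pd.DataFrame({"data1": [-0.1, -2.8, 0.3, 3.3],
                   "data2": [3.6, 0.49, -0.5, -0.12]})

# The chained comparison asks for the truth value of a whole DataFrame,
# which pandas refuses with a ValueError
try:
    df[-0.5 < df < 0.5]
except ValueError as err:
    print("chained comparison failed:", err)

# Elementwise mask: True where a value lies strictly between -0.5 and 0.5
in_band = (df > -0.5) & (df < 0.5)

# Variant 1: mask() puts NaN wherever the condition is True
masked = df.mask(in_band)

# Variant 2: the band is symmetric around zero, so abs() expresses it too
masked_abs = df.mask(df.abs() < 0.5)

print(masked)
```

Both variants leave `df` itself untouched, which may be preferable when the original values are still needed. The comparisons are strict, so a value of exactly -0.5 or 0.5 is kept.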
sacuL
  • Perfect. Thanks! – kxp Feb 15 '18 at 20:02
  • Glad I could help! If it solves your problem, feel free to [accept the answer](https://meta.stackexchange.com/questions/16721/how-does-accept-rate-work/65088#65088)! – sacuL Feb 15 '18 at 20:03