Is there a pandas function for repeated values?

Question

Consider a series x_1, x_2, x_3, x_4, ... I want to set x_i to NaN if x_i = x_{i+1}. Only adjacent values matter: I don't care if x_2 equals, say, x_9. For a second or two I thought this was what "duplicate values" meant, but I now see that a duplicate check would care about x_9. I'm pretty sure this routine must already exist in pandas, but I can't find it.

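import numpy as np

def ff_repeated(xnp):
    # Flag each element that equals its immediate predecessor
    nfnp = xnp.size
    ffnp = np.empty(nfnp, dtype=bool)
    ffnp[0] = False  # the first element has no predecessor
    for i in range(1, nfnp):
        ffnp[i] = xnp[i] == xnp[i-1]
    return ffnp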

Thoughts? I then use the above like this:

ffnp = ff_repeated(dm.loc["Pressure"].values)
dm.loc["Pressure",ffnp] = np.nan

Maybe take a look at [this](https://stackoverflow.com/questions/30673209/pandas-compare-next-row) — Scratch'N'Purr, Feb 22 '20 at 12:56

score 1 · Accepted Answer · answered Feb 22 '20 at 12:58

Your version should work just fine, but it uses a Python-level for loop and is therefore slow on large inputs. You can vectorize it by shifting the pd.Series by one position and comparing the result with the original:

xnp = pd.Series([1,2,3,3,4,2,5,5,6])
ffnp = xnp.shift(1) == xnp

ffnp

0    False
1    False
2    False
3     True
4    False
5    False
6    False
7     True
8    False
dtype: bool

You can then use ffnp to set the values to NaN, as you did.
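To tie the two steps together, here is a minimal sketch of the shift-and-compare approach followed by the NaN assignment. The names x, repeated, and masked are illustrative and not from the original post, and using Series.mask for the assignment is one possible choice rather than the only one.

```python
import numpy as np
import pandas as pd

# Illustrative data (same values as in the answer), stored as float so NaN fits
x = pd.Series([1, 2, 3, 3, 4, 2, 5, 5, 6], dtype=float)

# True wherever a value equals its immediate predecessor.
# The first element is compared against NaN, so it is always False.
repeated = x.shift(1) == x

# mask() returns a new Series with NaN wherever the condition is True
masked = x.mask(repeated)

print(masked)
```

Because NaN never compares equal to NaN, a run of consecutive missing values is not flagged by this comparison.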

Lukas Thaler

Exactly what I needed. For an entertaining (?) twist: if you want to always force the last element to be False, don't do ``` ffnp[-1] = False ``` because that creates a new index entry. The correct way is ffnp.iloc[-1] = False. Gosh, that took me an embarrassingly long time to debug... – Tunneller Feb 22 '20 at 23:55
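The pitfall described in this comment can be reproduced with a small sketch. It assumes the Series has the default integer RangeIndex, in which case an integer key in square brackets is treated as a label, not a position. The variable names are illustrative only.

```python
import pandas as pd

x = pd.Series([1, 2, 3, 3, 4, 5, 5])
flags = x.shift(1) == x  # the last entry is True because 5 == 5

# Label-based assignment: -1 is not an existing label, so a new row is appended
by_label = flags.copy()
by_label[-1] = False

# Position-based assignment: overwrites the actual last element
by_position = flags.copy()
by_position.iloc[-1] = False

print(len(flags), len(by_label), len(by_position))
```

With a non-integer index (for example, string labels) the behavior of a bare integer key differs, so .iloc is the unambiguous choice whenever a position is meant.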
