1

I have DataFrame like this:

        product_id          dt  products_qty  stock_qty
0          8225  2017-10-16         12.000    13.000
1          8280  2017-10-16         0.000     11.000
2          8225  2017-10-17         0.000     41.000
3          8280  2017-10-17         7.134     64.698
4          8225  2017-10-18         1.000      8.000
5          8280  2017-10-18         2.728     27.417
6          8225  2017-10-19         0.000     41.000
7          8280  2017-10-19         1.000     -2.000
8          8225  2017-10-20         2.000     -7.000
9          8280  2017-10-20         1.000     25.000
10         8225  2017-10-21         0.000     41.000
11         8280  2017-10-21         0.000     11.000

I have to get all rows where products_qty equals 0 and stock_qty values are the same. So in this case I shoul get DataFrame like this:

            product_id          dt  products_qty  stock_qty
    0          8280  2017-10-16         0.000     11.000
    2          8225  2017-10-17         0.000     41.000
    6          8225  2017-10-19         0.000     41.000
    10         8225  2017-10-21         0.000     41.000
    11         8280  2017-10-21         0.000     11.000

Thanks for help!

1 Answers1

0

Use boolean indexing with chained conditions - second by duplicated with parameter keep=False for check all duplicates:

print (df)
    product_id          dt  products_qty  stock_qty
0      2948225  2017-10-16        12.000     13.000
1      2948280  2017-10-16         0.000     11.000
2      2948225  2017-10-17         0.000     41.000
3      2948280  2017-10-17         7.134     64.698
4      2948225  2017-10-18         1.000      8.000
5      2948280  2017-10-18         2.728     27.417
6      2948225  2017-10-19         0.000     41.000
7      2948280  2017-10-19         1.000     -2.000
8      2948225  2017-10-20         2.000     -7.000
9      2948280  2017-10-20         1.000     25.000
10     2948225  2017-10-21         0.000     13.000 <- changed to 13
11     2948280  2017-10-21         0.000     11.000

#if need check duplicates in all column data
df1 = df[(df['products_qty'] == 0) & (df['stock_qty'].duplicated(keep=False))]
print (df1)
    product_id          dt  products_qty  stock_qty
1      2948280  2017-10-16           0.0       11.0
2      2948225  2017-10-17           0.0       41.0
6      2948225  2017-10-19           0.0       41.0
10     2948225  2017-10-21           0.0       13.0 <- because dupe with first row
11     2948280  2017-10-21           0.0       11.0

#if need check only duplicates in rows with 0 in products_qty

df2 = (df[df.loc[df['products_qty'] == 0, 'stock_qty']
           .duplicated(keep=False).reindex(df.index, fill_value=False)])
print (df2)
    product_id          dt  products_qty  stock_qty
1      2948280  2017-10-16           0.0       11.0
2      2948225  2017-10-17           0.0       41.0
6      2948225  2017-10-19           0.0       41.0
11     2948280  2017-10-21           0.0       11.0

df2 = df[df['products_qty'] == 0]
df2 = df2[df2['stock_qty'].duplicated(keep=False)]
print (df2)
    product_id          dt  products_qty  stock_qty
1      2948280  2017-10-16           0.0       11.0
2      2948225  2017-10-17           0.0       41.0
6      2948225  2017-10-19           0.0       41.0
11     2948280  2017-10-21           0.0       11.0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252