I have a question about the NumPy array below.

I have a dataset with 50 rows and 15 columns, and I created a NumPy array from it like this:

x = x.to_numpy()

I want to compare each row with every other row (excluding itself) and then find the number of rows that satisfy the following condition:

There is no other row whose

- values are both smaller, or

- values are equal in one column and smaller in the other

Money Weight
10 80
20 70
30 90
25 50
35 10
40 60
50 10

For instance, for row 1: no other row is smaller in both columns; wherever another row is smaller in one column, row 1 is smaller in the other. It satisfies the condition.

For row 3: no other row is smaller in both columns. It is equal to row 6 in the weight column, but it is smaller in the money column. It satisfies the condition.

For row 6: no other row is smaller in both columns. It is equal to row 3 in the weight column, but its money value is greater. It does not satisfy the condition.

I need the number of rows in the NumPy array that satisfy the condition.

I have tried a bunch of approaches but could not find a proper algorithm. If anyone has an idea, I would appreciate it.

Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76

1 Answer

IIUC, you can do the following:

# df is the original DataFrame and arr is its NumPy view: arr = df.to_numpy()
mask = (arr <= arr[:, None]).all(2).sum(1) < 2
res = df[mask]
print(res)

Output

   Money  Weight
0     10      80
1     20      70
3     25      50
4     35      10
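
Since the question asks for the number of rows, here is a minimal self-contained sketch of the same expression on the sample table. It assumes the two columns are stored in a plain NumPy array instead of a DataFrame; the names `kept_idx` and `n_rows` are only illustrative.

```python
import numpy as np

# Sample table from the question: columns are Money, Weight
arr = np.array([
    [10, 80],
    [20, 70],
    [30, 90],
    [25, 50],
    [35, 10],
    [40, 60],
    [50, 10],
])

# True where the only row that is <= this row in every column is the row itself
mask = (arr <= arr[:, None]).all(2).sum(1) < 2

kept_idx = np.flatnonzero(mask)   # positions of the rows that pass
n_rows = int(mask.sum())          # how many rows pass

print(kept_idx)  # [0 1 3 4]
print(n_rows)    # 4
```

The positions match the index column of the output above, and `mask.sum()` gives the count directly because `True` counts as 1.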

Breakdown

# pairwise comparison between rows (elementwise):
# comparison[i, j, k] is True when arr[j, k] <= arr[i, k]
comparison = (arr <= arr[:, None])

# reduce over columns: True where row j is lower than or equal to row i in every column
lower_values = comparison.all(2)

# for each row, count the rows that are lower or equal in every column
total_lower = lower_values.sum(1)

# keep only the rows whose count is 1 (the row itself)
mask = total_lower <= 1

# filter the original DataFrame
res = df[mask]

print(res)
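
If the broadcasting is hard to follow, a plain-loop version of the same test may help as a cross-check. This is only a sketch: the function name `undominated_mask_loop` is made up for illustration, and it assumes the condition means "no other row is lower than or equal to this row in every column", which is what the broadcast expression computes.

```python
import numpy as np

def undominated_mask_loop(arr):
    """Row i passes if no other row j has arr[j] <= arr[i] in every column."""
    n = len(arr)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(arr[j] <= arr[i]):
                mask[i] = False   # row j is lower or equal everywhere
                break
    return mask

# Sample table from the question: columns are Money, Weight
arr = np.array([[10, 80], [20, 70], [30, 90], [25, 50],
                [35, 10], [40, 60], [50, 10]])

# The loop and the broadcast expression should agree
vectorized = (arr <= arr[:, None]).all(2).sum(1) < 2
assert np.array_equal(undominated_mask_loop(arr), vectorized)
```

Two things worth knowing: both versions treat exact duplicate rows as ruling each other out, so neither copy is kept; and the broadcast version builds an intermediate boolean array of shape (rows, rows, columns), which is tiny for 50 x 15 but grows quadratically with the number of rows.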
Dani Mesejo
  • Thank you for the proposed solution! It works perfectly for me. I have a small update to my question. What if a new row comes in from outside the table and I want to check whether it satisfies the specified condition? How should we update the solution to handle that as well? For example, a new row (20, 20) comes in, and I want to check whether it satisfies the condition against the given table and get back True or False. Thanks! – Dinc Kirikci Dec 17 '22 at 14:34
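
For the follow-up in the comment, one possible sketch is to compare the new row against every existing row and check that none of them is lower or equal in every column. The helper name `satisfies_condition` is hypothetical, and the sketch assumes the same reading of the condition as the answer.

```python
import numpy as np

def satisfies_condition(arr, new_row):
    """True if no existing row is <= new_row in every column."""
    new_row = np.asarray(new_row)
    # (arr <= new_row) broadcasts the new row against each table row
    dominated = (arr <= new_row).all(axis=1).any()
    return not bool(dominated)

# Sample table from the question: columns are Money, Weight
arr = np.array([[10, 80], [20, 70], [30, 90], [25, 50],
                [35, 10], [40, 60], [50, 10]])

print(satisfies_condition(arr, [20, 20]))  # True
print(satisfies_condition(arr, [30, 60]))  # False, because (25, 50) is lower in both
```

This only tests the new row against the table. It does not re-check whether the existing rows still pass once the new row is added, and a new row identical to an existing one returns False, in line with how the answer's expression handles duplicates.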