Count occurrences of a pattern in a pandas DataFrame based on a condition

Question

Consider this dataframe:

import pandas as pd
df = pd.DataFrame(['A(3)BC(1)', 'A(2)BC(5)', 'A(1)BC(3)', 'A(2)BC(5)', 'A(4)BC(2)'], columns=['Column1'])

    Column1
0   A(3)BC(1)
1   A(2)BC(5)
2   A(1)BC(3)
3   A(2)BC(5)
4   A(4)BC(2)

Is there a way to count the rows where the number after A is greater than 3, without iterating over every row of the dataframe?

score 3 · Accepted Answer · answered Aug 09 '22 at 18:08

Let's try

out = df['Column1'].str.extract(r'A\((\d+)\)')[0].astype(int).gt(3).sum()

print(out)

1
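For anyone reading the one-liner for the first time, here is the same chain unpacked into steps. This is only a sketch: the names `a_raw` and `a_num` are mine, and the extra row `'no match'` is hypothetical, added to show what happens when a string does not contain the pattern. It swaps `astype(int)` for `pd.to_numeric`, since `astype(int)` would raise on the missing value that `str.extract` produces for a non-matching row.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical non-matching row.
df = pd.DataFrame(
    ['A(3)BC(1)', 'A(2)BC(5)', 'A(1)BC(3)', 'A(2)BC(5)', 'A(4)BC(2)', 'no match'],
    columns=['Column1'],
)

# Step 1: capture the digits inside "A(...)"; expand=False returns a Series.
a_raw = df['Column1'].str.extract(r'A\((\d+)\)', expand=False)

# Step 2: convert the captured strings to numbers; missing values stay NaN.
a_num = pd.to_numeric(a_raw)

# Step 3: compare and count; NaN > 3 evaluates to False, so it is not counted.
out = a_num.gt(3).sum()
print(out)
```

With the question's data, only the row `A(4)BC(2)` passes the check, so the count is still 1.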

Ynjxsjmh


Worked perfectly. Is it possible to combine conditions on A and B, for example to find A>3 and B<5? – Lucas Lazari Aug 09 '22 at 18:55

@LucasLazari You can extract the number for `B` in the same way and then combine the two conditions; see https://stackoverflow.com/questions/48978550/pandas-filtering-multiple-conditions. – Ynjxsjmh Aug 09 '22 at 18:58
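A possible way to do what the comment describes, as a sketch only. It assumes that "B" in the comment means the second number, the one in parentheses after `BC`, and that every row matches the pattern (otherwise `astype(int)` would fail on the missing values). The group names `a` and `b` are my own choice.

```python
import pandas as pd

df = pd.DataFrame(
    ['A(3)BC(1)', 'A(2)BC(5)', 'A(1)BC(3)', 'A(2)BC(5)', 'A(4)BC(2)'],
    columns=['Column1'],
)

# Named groups become column names in the DataFrame returned by str.extract.
nums = df['Column1'].str.extract(r'A\((?P<a>\d+)\)BC\((?P<b>\d+)\)').astype(int)

# Combine the boolean masks with & (parentheses are required), then count the True values.
count = ((nums['a'] > 3) & (nums['b'] < 5)).sum()
print(count)
```

Here only `A(4)BC(2)` satisfies both conditions, so this prints 1.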

score 1 · Answer 2 · answered Aug 09 '22 at 18:09

If the format is always A(X) at the start, you can do:

df['Column1'].apply(lambda st: int(st[st.find("A")+2:st.find(")")])).gt(3).sum()

Output

1

Yuca
