I have a list (list_to_match = ['a','b','c','d']) and a dataframe like the one below:
Index | One | Two | Three | Four |
---|---|---|---|---|
1 | a | b | d | c |
2 | b | b | d | d |
3 | a | b | d | |
4 | c | b | c | d |
5 | a | b | c | g |
6 | a | b | c | |
7 | a | s | c | f |
8 | a | f | c | |
9 | a | b | | |
10 | a | b | t | d |
11 | a | b | g | |
... | ... | ... | ... | ... |
100 | a | b | c | d |
My goal is to filter for the rows with the most matches against the list in the corresponding positions (i.e. position 1 in the list has to match column 1, position 2 has to match column 2, etc.). In this specific case, excluding row 100, rows 5 and 6 would be selected since they match 'a', 'b' and 'c'; if row 100 were included, row 100 and any other rows matching all elements would be selected instead. The list might also change in length, e.g. list_to_match = ['a','b'].
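To make the expected result concrete, here is a minimal sketch of the selection I have in mind. It assumes that matches only count while they are consecutive starting from the first column, because that is the reading under which rows 5 and 6 (and not rows 4 or 10) are the ones selected. The function name rows_with_longest_prefix_match is made up for illustration, the empty cells are assumed to be None, and only rows 1 to 11 of the table are included.

```python
import pandas as pd

list_to_match = ['a', 'b', 'c', 'd']

# Rows 1-11 from the table; empty cells are assumed to be None.
df = pd.DataFrame(
    [
        ['a', 'b', 'd', 'c'],
        ['b', 'b', 'd', 'd'],
        ['a', 'b', 'd', None],
        ['c', 'b', 'c', 'd'],
        ['a', 'b', 'c', 'g'],
        ['a', 'b', 'c', None],
        ['a', 's', 'c', 'f'],
        ['a', 'f', 'c', None],
        ['a', 'b', None, None],
        ['a', 'b', 't', 'd'],
        ['a', 'b', 'g', None],
    ],
    columns=['One', 'Two', 'Three', 'Four'],
    index=range(1, 12),
)


def rows_with_longest_prefix_match(frame, values):
    """Hypothetical helper: keep the rows whose leading columns match
    the longest run of `values`, compared position by position."""
    # Only compare as many columns as there are values in the list.
    cols = frame.columns[:len(values)]
    target = pd.Series(list(values)[:len(cols)], index=cols)

    # True where a cell equals the list element at the same position.
    matches = frame[cols].eq(target, axis='columns')

    # The cumulative product zeroes everything after the first mismatch,
    # so the row sum is the length of the matching prefix.
    prefix_len = matches.astype(int).cumprod(axis=1).sum(axis=1)

    return frame[prefix_len == prefix_len.max()]


print(rows_with_longest_prefix_match(df, list_to_match))  # rows 5 and 6
print(rows_with_longest_prefix_match(df, ['a', 'b']))     # shorter list
```

If matches should instead be counted regardless of position in the row (so that rows 4 and 10 tie with rows 5 and 6), dropping the cumprod step and summing matches directly would give that variant.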
Thanks for your help!