Comparing values from a DataFrame against another
Given the following data:
import pandas as pd
data_df = pd.DataFrame({"Reference": ("A", "A", "A", "B", "C", "C", "D", "E"), "Value": ("U", "U", "Ux", "V", "W", "Ww", "X", "Y")}, index=[1, 2, 3, 4, 5, 6, 7, 8])
truth_df = pd.DataFrame({"Reference": ("A", "B", "C", "D", "E"), "Value": ("U", "V", "W", "X", "Y")}, index=[1, 4, 5, 7, 8])
data_df
| | Reference | Value |
---|---|---|
1 | A | U |
2 | A | U |
3 | A | Ux |
4 | B | V |
5 | C | W |
6 | C | Ww |
7 | D | X |
8 | E | Y |
truth_df
| | Reference | Value |
---|---|---|
1 | A | U |
4 | B | V |
5 | C | W |
7 | D | X |
8 | E | Y |
I need to check whether each value in data_df matches the value that truth_df gives for the same Reference, flag the rows that do not match, and hopefully end up with a new DataFrame like:
result_df
| | Reference | Value | Issues |
---|---|---|---|
1 | A | U | |
2 | A | U | |
3 | A | Ux | Wrong |
4 | B | V | |
5 | C | W | |
6 | C | Ww | Wrong |
7 | D | X | |
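
For what it's worth, here is a minimal sketch of one way this could be done. It is not the only approach. It assumes the second column is named `Value`, as in the printed tables, and that each `Reference` appears only once in truth_df. The names `expected` and `expected_per_row` are just illustrative helpers.

```python
import pandas as pd

# Sample data, using the column names and values shown in the tables.
data_df = pd.DataFrame(
    {"Reference": ["A", "A", "A", "B", "C", "C", "D", "E"],
     "Value": ["U", "U", "Ux", "V", "W", "Ww", "X", "Y"]},
    index=[1, 2, 3, 4, 5, 6, 7, 8],
)
truth_df = pd.DataFrame(
    {"Reference": ["A", "B", "C", "D", "E"],
     "Value": ["U", "V", "W", "X", "Y"]},
    index=[1, 4, 5, 7, 8],
)

# Reference -> expected Value lookup.
# Assumption: every Reference is unique in truth_df.
expected = truth_df.set_index("Reference")["Value"]

# Expected Value for each row of data_df, aligned on data_df's index.
# A Reference that is missing from truth_df maps to NaN.
expected_per_row = data_df["Reference"].map(expected)

# Copy the data and flag every row whose Value differs from the expected one.
# Rows with an unknown Reference (NaN expected value) are flagged as well.
result_df = data_df.copy()
result_df["Issues"] = ""
result_df.loc[result_df["Value"] != expected_per_row, "Issues"] = "Wrong"

print(result_df)
```

With the sample data, only rows 3 and 6 end up with `Wrong` in the `Issues` column. If a `Reference` could repeat in truth_df, the lookup step would need a different strategy, for example a merge on both columns.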