0

What is the best way to compare two columns in a DataFrame and highlight the rows where they differ?

import pandas as pd

df = pd.DataFrame({'ID':['one2', 'one3', 'one3', 'one4' ],
                   'Volume':[5.0, 6.0, 7.0, 2.2],
                   'BOX':['one','two','three','four'],
                   'BOX2':['one','two','five','one hundred']})

I am trying to compare the BOX column and BOX2 column and I'd like to highlight the differences between them.

himnaejah
  • 3
    What should "highlighting" look like? –  Mar 16 '22 at 15:28
  • 1
    I was thinking of using DataFrame.style, so something like this: https://www.analyticsvidhya.com/blog/2021/06/style-your-pandas-dataframe-and-make-it-stunning/ – himnaejah Mar 16 '22 at 15:33

2 Answers

2

Maybe you can do something like this:

df.style.apply(lambda x: (x != df['BOX']).map({True: 'background-color: red; color: white', False: ''}), subset=['BOX2'])

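Output (in Jupyter): the BOX2 cells in rows 2 and 3 ('five' and 'one hundred') are shown with a red background and white text; the other cells are unstyled.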
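
Before styling anything, it can help to look at the comparison itself. The following is a minimal sketch, not part of the original answer, that rebuilds the question's frame and computes the boolean mask the styling relies on. The name `mismatch` is chosen here for illustration only.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({'ID': ['one2', 'one3', 'one3', 'one4'],
                   'Volume': [5.0, 6.0, 7.0, 2.2],
                   'BOX': ['one', 'two', 'three', 'four'],
                   'BOX2': ['one', 'two', 'five', 'one hundred']})

# Element-wise "not equal": True where BOX and BOX2 disagree.
# Note that two missing values (NaN) also compare as not equal.
mismatch = df['BOX'].ne(df['BOX2'])

# Show only the rows that differ, restricted to the compared columns
print(df.loc[mismatch, ['BOX', 'BOX2']])
```

The same mask is what the lambda above maps to CSS strings, so checking it first makes it easier to tell whether a surprising highlight comes from the data or from the styling.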

1

You might try something like:

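def hl(d):
    # Start from an all-empty style frame shaped like the data
    styles = pd.DataFrame('', columns=d.columns, index=d.index)
    # Style BOX and BOX2 in the rows where the two columns differ
    styles.loc[d['BOX'].ne(d['BOX2']), ['BOX', 'BOX2']] = 'background: yellow'
    return styles

df.style.apply(hl, axis=None)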

Output: the BOX and BOX2 cells in rows 2 and 3 are highlighted in yellow.

To highlight the whole row:

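def hl(d):
    # Start from an all-empty style frame shaped like the data
    styles = pd.DataFrame('', columns=d.columns, index=d.index)
    # Style every column in the rows where BOX and BOX2 differ
    styles.loc[d['BOX'].ne(d['BOX2'])] = 'background: yellow'
    return styles

df.style.apply(hl, axis=None)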

Output: every cell in rows 2 and 3 is highlighted in yellow.
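
The two variants above differ only in which columns get styled, so they could be folded into one helper. This is a rough sketch under stated assumptions: the function name `highlight_mismatch` and its parameters `left`, `right`, `css` and `whole_row` are hypothetical names introduced here, not something from pandas or from the answer.

```python
import pandas as pd

def highlight_mismatch(data, left='BOX', right='BOX2',
                       css='background: yellow', whole_row=False):
    """Return a frame of CSS strings marking rows where two columns differ.

    Hypothetical helper: all parameter names are illustrative.
    """
    # Empty string means "no style" for that cell
    styles = pd.DataFrame('', index=data.index, columns=data.columns)
    mismatch = data[left].ne(data[right])
    # Either style every column, or only the two compared columns
    target = list(data.columns) if whole_row else [left, right]
    styles.loc[mismatch, target] = css
    return styles

# Same data as in the question
df = pd.DataFrame({'ID': ['one2', 'one3', 'one3', 'one4'],
                   'Volume': [5.0, 6.0, 7.0, 2.2],
                   'BOX': ['one', 'two', 'three', 'four'],
                   'BOX2': ['one', 'two', 'five', 'one hundred']})

cell_styles = highlight_mismatch(df)
row_styles = highlight_mismatch(df, whole_row=True)
```

In a notebook this would be used as `df.style.apply(highlight_mismatch, axis=None)`, or `df.style.apply(highlight_mismatch, axis=None, whole_row=True)` for full rows, since `Styler.apply` forwards extra keyword arguments to the function.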

mozway