2

I have a pandas data-frame (Pre_Final_DataFrame) that I am writing to excel.

I need to highlight a row in Excel if that corresponding row has a "No Match" word on any of the column that starts with 'Result_'.

So, I decided to go for an array to understand which one needed to be highlighted.

But now, I would prefer a way to highlight using a function as it is too slow. Kindly help me with this.

In Simple words, I am writing a dataframe to excel using Pandas and it has million records and I want a row to be highlighted in "Yellow" only when there is a No Match value present in any one of the column that has a name starting with " Result_"

The Expected result appears in excel looks like below,

Input codes to start with a dataframe:-

import pandas as pd

data = {
     'ColA':[1, 1], 
     'ColB':[1, 1],
     'Result_1':['Match', 'Match'], 
     'ColA1':[1, 2], 
     'ColB1':[1, 1],
     'Result_2':['No Match', 'Match'],  
     }

Pre_Final_DataFrame = pd.DataFrame(data)

ResultColumns_df = Pre_Final_DataFrame.filter(like='Result_')
ResultColumns_df_false =ResultColumns_df[ResultColumns_df.values  == "No Match"]
RequiredRows_Highlight = ResultColumns_df_false.index.tolist()

writer = pd.ExcelWriter(OutputName,date_format='%YYYY-%mm-%dd',datetime_format='%YYYY-%mm-%dd')
Pre_Final_DataFrame.to_excel(writer,'Sheet1',index = False)
writer.save()

Output Expected:

enter image description here

Rojo
  • 23
  • 1
  • 5

1 Answers1

6

We can use the StyleFrame package for reading it into an excel sheet.

import pandas as pd
from StyleFrame import StyleFrame, Styler

df = pd.read_excel("Your Excel Sheet")
sf = StyleFrame(df)

style = Styler(bg_color='yellow') 
for col in df:
     sf.apply_style_by_indexes(sf[sf[col]== 'No Match'],styler_obj=style)

sf.to_excel('test.xlsx').save()

This has helped me in getting an output excel sheet with all the rows highlighted that contains at least one column with value No Match.

Hope this helps. Cheers

Sumanth
  • 497
  • 4
  • 12
  • 1
    It gives me this error - 'Styler' object has no attribute 'to_excel' on this line new_df.to_excel(writer,'Sheet1',index = False) – Rojo Jan 18 '19 at 14:41
  • `new_df` is a pandas.style.Styler object. Presently pandas do not have a functionality of writing style object to an excel sheet. I think It looks like the openpyxl engine can do the work: [link] (https://pandas.pydata.org/pandas-docs/stable/style.html) – Sumanth Jan 18 '19 at 15:29
  • 1
    Now, I don't understand how we write this new_df to a new file using openpyxl. This link gave me some Idea, but it is not helping - https://stackoverflow.com/questions/36657288/copy-pandas-dataframe-to-excel-using-openpyxl – Rojo Jan 18 '19 at 15:57
  • @Rojo I have updated the answer, I have got the highlighted excel sheet as output. Hope it helps. – Sumanth Jan 18 '19 at 18:12