2

I have files with a lot of weird formatting and I'm trying to create a function that removes any formatting from an xlsx file.

Some guys in here suggested to use "cell.fill = PatternFill(fill_type=None)" to clear any format from a given cell.

path = r'C:\Desktop\Python\Openpyxl\Formatted.xlsx
wb = xl.load_workbook(filename = path)

def removeFormatting(file):
    ws = wb[file]
    
    for row in ws.iter_rows():
        for cell in row:
            if cell.value is None:
                cell.fill = PatternFill(fill_type=None)

    wb.save(path)

for s in wb.sheetnames:
    removeFormatting(s)

But this won't change anything. If the cells are empty but colored, then openpyxl still sees them as non empty.

Following this post: Openpyxl check for empty cell

The problem with ws.max_column and ws.max_row is that it will count blank columns as well, thus defeating the purpose."

@bhaskar was right. When I'm trying to get the max column, I get for all the sheets, the same value as from the first sheet.

col = []    
for sheet in wb.worksheets:
        col.append(sheet.max_column)

So even if there are different sheet dimensions, if the cell has a background color or any other formatting, it will take it as valid non empty cell.

Does anyone know how to solve this?

Thanks in advance!

Raccoon_17
  • 153
  • 2
  • 15

1 Answers1

7

This function removes styles from all cells in a given worksheet (passed as ws object). So you can open your file, iterate over all worksheets and apply this function to each one:

def removeFormatting(ws):
    # ws is not the worksheet name, but the worksheet object
    for row in ws.iter_rows():
        for cell in row:
            cell.style = 'Normal'

If you also want to check info about how to define and apply named styles, take a look here: https://openpyxl.readthedocs.io/en/stable/styles.html#cell-styles-and-named-styles

abu
  • 422
  • 7
  • 14