4

I have attempted to open an excel file in which I need to insert dataframes to certain sheets, while leaving other sheets alone. The script works fine when I tested it on other excel files. When I use it on the one I actually need, I get an error message.

Here is the script:

from openpyxl import load_workbook
book = load_workbook(self.directory)

Self.directory refers to my file location. As you can see in the traceback, it fails already at this line when trying to execute load_workbook(), and gives the following error message:

ValueError: Max value is 14

Here is the relevant traceback (I left the directory locations starting with the virtual environment folder 'virtual'):

"""
book = load_workbook(self.directory)
virtual\lib\site-packages\openpyxl\reader\excel.py", line 217, in load_workbook
shared_strings = read_string_table(archive.read(strings_path))
virtual\lib\site-packages\openpyxl\reader\strings.py", line 22, in read_string_table
text = Text.from_tree(node).content
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\styles\fonts.py", line 110, in from_tree
return super(Font, cls).from_tree(node)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 100, in from_tree
return cls(**attrib)
virtual\lib\site-packages\openpyxl\cell\text.py", line 114, in __init__
self.family = family
virtual\lib\site-packages\openpyxl\descriptors\nested.py", line 36, in __set__ 6, in __set__
super(Nested, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 110, in __set__ , in __set__ 
super(Min, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 89, in __set__ in __set__
raise ValueError('Max value is {0}'.format(self.max))
ValueError: Max value is 14
"""

I realized that the excelfile I was using is over the limit of self.max.

I tried sifting through the the openpyxl scripts myself, but I could not manage to figure out what self.max refers to, or how I can change my Excel File so that I can load the workbook.

Can anyone point me to the right direction?

Thanks in advance!

11 Answers11

3

I had to remove all formatting in a sheet I was working with.

In Libreoffice; select all, "clear direct formatting"

CarpetLoup
  • 31
  • 3
  • I had a similar problem which gave me this error: `'min value is {0}'` for loading an Excel file through `pd.read_excel` which uses _openpyxl_ engine. I got through this by **clearing all formatting** as you suggested here and worked with no problem! Thanks!! – Elias Mar 05 '21 at 09:26
2

Here's what fixed this error for me. I edited lib\site-packages\openpyxl\descriptors\base.py and added a print statement after line 86 in class Max like so:

def __set__(self, instance, value):
    if ((self.allow_none and value is not None)
        or not self.allow_none):
        value = _convert(self.expected_type, value)
        if value > self.max:
            print(f"value is {value}")
            raise ValueError('Max value is {0}'.format(self.max))
    super(Max, self).__set__(instance, value)

This printed the value of 34 which is obviously higher than the max value of 14 (it's a font family value).

I then saved a copy of my Excel spreadsheet with a .zip extension, extracted all of the XML files, and then used grep searching for val="34". This led me to 3 cells which somehow had font-family=34 in them. I changed the font to something else in Excel, saved the spreadsheet, then changed it back to the original font (Arial) and saved.
After all of this, the error was gone.

Mike Divot
  • 21
  • 1
2

Instead of patching the __set__ method you can patch the specific descriptor's max value.

This approach is more surgical in that it only patches the font family descriptor that is causing the error. By comparison, if you patch descriptors\base.py as suggested in other answers, you'll be allowing potentially bad values for all descriptors, not just font family.

# IMPORTANT, you must do this before importing openpyxl
from unittest import mock
# Set max font family value to 100
p = mock.patch('openpyxl.styles.fonts.Font.family.max', new=100)
p.start()
import openpyxl
openpyxl.open('my-bugged-worksheet.xlsx') # this works now!

This approach uses mock but in principle you can also patch the .py file where the openpyxl.styles.fonts.Font.family.max descriptor is contained.

sytech
  • 29,298
  • 3
  • 45
  • 86
2

For Pandas uses - how to clear XLSX styles completely

which will fix all kind of style errors like 'ValueError: Max value is 14'

1. Add your own XLSX engine for pandas in one of your init.py files which is loaded automatically

from openpyxl.reader.excel import ExcelReader
from openpyxl.xml import constants as openpyxl_xml_constants
from pandas import ExcelFile
from pandas.io.excel._openpyxl import OpenpyxlReader

class OpenpyxlReaderWOFormatting(OpenpyxlReader):
    """OpenpyxlReader without reading formatting
    - this will decrease number of errors and speedup process
    error example https://stackoverflow.com/q/66499849/1731460 """

    def load_workbook(self, filepath_or_buffer):
        """Same as original but with custom archive reader"""
        reader = ExcelReader(filepath_or_buffer, read_only=True, data_only=True, keep_links=False)
        reader.archive.read = self.read_exclude_styles(reader.archive)
        reader.read()
        return reader.wb

    def read_exclude_styles(self, archive):
        """skips addings styles to xlsx workbook , like they were absent
        see logic in openpyxl.styles.stylesheet.apply_stylesheet """

        orig_read = archive.read

        def new_read(name, pwd=None):
            if name == openpyxl_xml_constants.ARC_STYLE:
                raise KeyError
            else:
                return orig_read(name, pwd=pwd)

        return new_read

ExcelFile._engines['openpyxl_wo_formatting'] = OpenpyxlReaderWOFormatting
print('Monkey patching pandas XLS engines. See OpenpyxlReaderWOFormatting')

2. Use it, specify engine='openpyxl_wo_formatting' when you read XLSX file

import pandas as pd
df = pd.read_excel(filepath, engine='openpyxl_wo_formatting')
pymen
  • 5,737
  • 44
  • 35
  • while I like this as a solution it didn't fix my problem – Peter May 20 '22 at 00:46
  • 1
    @Peter i think you may need to debug which "name" if failed to read by by `OpenpyxlReader archive.read` and add it to `name IN (openpyxl_xml_constants.ARC_STYLE, YOUR_STYLE)` – pymen May 20 '22 at 07:19
  • I don't understand what you are suggesting – Peter May 23 '22 at 22:45
1

I was able to get around the error after removing the 15th+ little "comment boxes" on the sheet.

Iniq
  • 13
  • 2
  • That is interesting. I don't have any comment boxes myself, but i's possible that I have something else which is lumped together with the comment boxes in self.max. I will do some digging. Thanks for your answer! – Emil Zaharia-Kezdi Jun 21 '18 at 18:11
1

It is an excel file generated WPS but not MS office.

  1. you can use xlwings to open it.
  2. you can save to CSV file manually and read.
user1211
  • 1,507
  • 1
  • 18
  • 27
Jing Cao
  • 11
  • 1
1

Issue is resolved if you suppress/commentout the exception like shown below in openpyxl:

def __set__(self, instance, value):
        if ((self.allow_none and value is not None)
            or not self.allow_none):
            value = _convert(self.expected_type, value)
            if value > self.max:
                self.max=self.max
                #raise ValueError('Max value is {0}'.format(self.max))
        super(Max, self).__set__(instance, value)

It resolved the issue and now I am able to use

pd.read_excel(io.BytesIO(obj['Body'].read()), engine='openpyxl', sheet_name=[0], header=None)
0

the number of comment boxed didn't solve my problem. I had to remove some worksheets until I got below 14 worksheets in total to be able to open/read the document.

0

I had the same issue — unsupported font in the file.

I just opened the file in MS Excel, reset the fonts to a basic Arial and saved a new file. And Openpyxl accepted the new file.

Nitin Nain
  • 5,113
  • 1
  • 37
  • 51
0

If you more than 14 files in a workbook this error will appear. So, try to paste the content of excel into a new file and then try to load it. This resolved my issue. Hope it resolves yours as well.

  • 1
    Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – xlmaster Feb 23 '23 at 18:46
-1

Just comment out the line of code in openpyxl which raises the error.