I am using pandas HDFStore to store DataFrames that I create from my data.

import gzip
import json
import pandas as pd

store = pd.HDFStore(storeName, ...)
for file in downloaded_files:
    try:
        with gzip.open(file) as f:
            data = json.loads(f.read())
        # Flatten the nested JSON into a DataFrame
        df = pd.json_normalize(data)
        store.append(storekey, df, format='table', append=True)
    except TypeError:
        # File error: skip this file
        pass

I get this error:

ValueError: Trying to store a string with len [82] in [values_block_2] column but
this column has a limit of [72]!
Consider using min_itemsize to preset the sizes on these columns

I found that it is possible to set min_itemsize for the column involved, but this is not a viable solution: I do not know the maximum length I will encounter, nor which columns will run into the problem.

Is there a way to automatically catch this exception and handle it each time it occurs?

hangc

1 Answer

I think you can do it this way:

store.append(storekey, df, format='table', append=True, min_itemsize={'Long_string_column': 200})

This is very similar to the following SQL CREATE TABLE statement:

create table df(
  id     int,
  str    varchar(200)
);

Here 200 is the maximum allowed length for the str column.
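If the column names and lengths are not known in advance, one option is to measure them before the first append. The following is only a rough sketch: string_itemsizes, headroom and floor are hypothetical names made up for illustration, not part of pandas, and the factor of 2 is an arbitrary guess at how much longer later strings might be.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def string_itemsizes(df, headroom=2.0, floor=16):
    """Hypothetical helper: build a {column: size} dict for string-like columns."""
    sizes = {}
    for col in df.columns:
        series = df[col]
        # Only object/string columns are stored as fixed-width strings.
        if not (is_object_dtype(series) or is_string_dtype(series)):
            continue
        # Measure the longest value in UTF-8 bytes, skipping missing values.
        lengths = [len(str(v).encode("utf-8")) for v in series.dropna()]
        longest = max(lengths, default=0)
        # Leave headroom for longer strings in later files (assumed factor).
        sizes[col] = max(floor, int(longest * headroom))
    return sizes


df = pd.DataFrame({"id": [1, 2], "Long_string_column": ["short", "x" * 82]})
sizes = string_itemsizes(df)
print(sizes)
```

The resulting dict can be passed as min_itemsize=sizes on the append that creates the table. As far as I know the string widths are fixed at that point, so a later string longer than the reserved size will still raise the same ValueError; the headroom only makes that less likely. pandas also documents min_itemsize as accepting a single int instead of a dict, which may be simpler when the affected columns are unknown.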

The following links might be very helpful:

https://www.google.com/search?q=pandas+ValueError%3A+Trying+to+store+a+string+with+len+in+column+but+min_itemsize&pws=0&gl=us&gws_rd=cr

HDFStore.append(string, DataFrame) fails when string column contents are longer than those already there

Pandas pytable: how to specify min_itemsize of the elements of a MultiIndex

MaxU - stand with Ukraine