json.dumps() on a pandas aggregation result raises TypeError: Object of type int64 is not JSON serializable.

Here is the dataframe:

import json
import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
df
Out[47]: 
   col1  col2
0     1     3
1     2     4

Here is how I am aggregating the column:

sum_col = df.col1.sum()
sum_col
Out[49]: 3

But as soon as I pass the result to json.dumps(), it raises a TypeError:

data = json.dumps(sum_col)
data
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-50-5d4b966e64cc> in <module>
----> 1 data = json.dumps(sum_col)
      2 data
TypeError: Object of type int64 is not JSON serializable

ivanleoncz
Shilp Thapak

1 Answer

I resolved this one.

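Pandas aggregation functions (like sum, count and mean) return NumPy scalar types, not built-in Python numbers. Here, sum on an integer column returns a numpy.int64, which prints exactly like a Python integer but is not one, so json.dumps() does not know how to serialize it.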

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
sum_col = df.col1.sum()
type(sum_col)

Out[52]: numpy.int64

This can be fixed by converting the result with Python's built-in int() function.

sum_col = int(df.col1.sum())
data = json.dumps(sum_col)
data

Out[56]: '3'
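
When several aggregates go into one payload, wrapping each value in int() gets repetitive. Another option is to hand json.dumps() a fallback through its `default` parameter. The following is only a minimal sketch: `to_builtin` is a hypothetical helper name, not part of pandas or NumPy, and it assumes the only non-serializable objects are NumPy scalars and arrays.

```python
import json

import numpy as np
import pandas as pd


def to_builtin(obj):
    """Hypothetical fallback for json.dumps: NumPy values -> built-in types."""
    if isinstance(obj, np.generic):
        # NumPy scalar (int64, float32, bool_, ...) -> matching Python scalar
        return obj.item()
    if isinstance(obj, np.ndarray):
        # NumPy array -> (nested) Python list
        return obj.tolist()
    # Anything else is still unsupported, so fail the same way json does
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
summary = {
    "sum": df.col1.sum(),
    "count": df.col1.count(),
    "mean": df.col1.mean(),
}

# json.dumps calls to_builtin only for objects it cannot serialize itself
payload = json.dumps(summary, default=to_builtin)
print(payload)
```

json.dumps() only calls `default` for objects it cannot handle on its own, so plain Python values pass through untouched.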
Shilp Thapak
  • It seems `sum_col.item()` is a better choice [ref](https://stackoverflow.com/a/11389998/1518100) – Lei Yang Mar 29 '22 at 05:50
  • For the sake of good information, here's the official statement from the pandas documentation regarding `int64`: https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#defaults – ivanleoncz May 10 '23 at 14:17