6

I have two CSV files, and all the numeric fields are ints, with no decimals. When I use the pandas merge function to join the two DataFrames, the int fields from one DataFrame all become decimals. Why does that happen?

int becomes decimal

unutbu
Hong

1 Answer

12

Each column of a DataFrame has a dtype. The dtype controls what kinds of values can be contained in that column. Columns with integer dtypes, unsurprisingly, can contain only integers. Columns with floating point dtypes contain only floats -- and NaN is a float:

In [190]: import numpy as np

In [191]: isinstance(np.nan, float)
Out[191]: True

So even though age and score are integer-valued columns, the merged age_y and score_y columns contain NaN in the rows that had no match in the other DataFrame, so their dtype must be upcast to a floating point dtype to accommodate the NaN.
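As a rough illustration, here is a minimal sketch of the effect. The frames `left` and `right`, their `name` key, and their values are made up for this example (they are not the data from the question), and the nullable `Int64` dtype is assumed to be available (pandas 0.24 or later):

```python
import pandas as pd

# Hypothetical data: every numeric column starts out as an integer dtype.
left = pd.DataFrame({"name": ["a", "b", "c"], "age": [30, 41, 25]})
right = pd.DataFrame({"name": ["a", "b"], "age": [31, 42], "score": [90, 85]})

# "c" has no match in `right`, so age_y and score get NaN in that row
# and are upcast to a float dtype; age_x keeps its integer dtype.
merged = left.merge(right, on="name", how="left")
print(merged.dtypes)

# An inner join drops the unmatched row, so no NaN appears
# and the integer dtypes are preserved.
inner = left.merge(right, on="name", how="inner")
print(inner.dtypes)

# One possible workaround: the nullable integer dtype "Int64" (capital I)
# stores missing values as <NA> instead of forcing a float dtype.
nullable = merged.astype({"age_y": "Int64", "score": "Int64"})
print(nullable.dtypes)
```

Whether to use an inner join, fill the missing values before casting back to int, or switch to `Int64` depends on what the unmatched rows should mean in your data.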

unutbu