Why does Pandas.DataFrame.iloc convert to numpy.float64 and round?

Question

Take this number as an example:

1.64847910404205

If I create a Pandas DataFrame with a row and this value:

import pandas as pd

df = pd.DataFrame([{'id': 77, 'data': 1.64847910404205}])

and then iterate over the rows (Okay... the 'row') and inspect:

for index, row in df.iterrows():
    if index > 0:
        previous_row = df.iloc[index]

Of course the above is weird: why would I iterate over the rows just to pull the same row from the DF? Forget that; I removed the -1 to illustrate.

Now, if I use SciView (part of IntelliJ) and the data tab to inspect the rows individually, I get this:

row
data: 1.64847910404205

previous_row
data: 1.64847910404

Notice that previous_row has been rounded. It's because they are for some reason different data types...

row: 
type(row) #float64

previous_row:
type(previous_row) #numpy.float64

I'm curious to know: why does iloc convert to a numpy.float64 and how can I prevent it from doing so?

I need the same level of precision as I will later be doing Peak Signal to Noise Ratio (PSNR) calculations. Of course, I could just convert the float to a numpy.float64, but I don't want to lose precision.

It might just be the way it's displayed. What does `row == previous_row` return? — busybear, Dec 12 '18 at 21:46
@busybear Oh, good call. It does show as being equal. Why would they display differently? The data types are different: is it just the `labels` that differ, while the actual `data` is the same? — pookie, Dec 12 '18 at 21:48
Python doesn't have a `float64` builtin object (just `float`), so I don't think they are actually different data types. Perhaps `numpy.float64` was imported as `float64` somewhere. Just speculating. — busybear, Dec 12 '18 at 21:58
Related: [Numpy float64 vs Python float](https://stackoverflow.com/questions/27098529/numpy-float64-vs-python-float) — jpp, Dec 12 '18 at 23:12

score 2 · Answer 1 · answered Dec 12 '18 at 23:09

The type of the 'data' column in your dataframe is numpy.float64, even if Pandas only reports it as float64. You can prove this to yourself with the following:

import numpy

df['data'].dtype.type is numpy.float64

which will return True. An alternative form would be:

type(df['data'].values[0]) is numpy.float64

which will also return True.

Any difference in display is down to how SciView renders the value, not to any change in the underlying data: iloc does not convert or round anything.
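As a rough check, the sketch below rebuilds the one-row DataFrame from the question and compares the value seen through iterrows() with the one fetched through iloc. It assumes pandas and NumPy are installed; the name same_row is only for illustration and is not from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([{'id': 77, 'data': 1.64847910404205}])

# The row as seen while iterating, and the same row fetched by position.
index, row = next(df.iterrows())
same_row = df.iloc[index]

# Both are Series; the 'data' element has the same type and value in each.
print(type(row['data']), type(same_row['data']))
print(row['data'] == same_row['data'])

# The column dtype is NumPy's float64, which pandas prints as "float64".
print(df['data'].dtype.type is np.float64)

# numpy.float64 subclasses Python's float; both are 64-bit doubles.
print(issubclass(np.float64, float))

# repr shows every stored digit; a shorter form is a display choice.
print(repr(float(same_row['data'])))
```

If the comparison prints True and the repr shows all the digits, the stored value is unchanged and the shorter number in the viewer is only a formatting difference.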
