
I have a pd.DataFrame in which one column holds numeric values stored as strings:

import pandas as pd

df = pd.DataFrame({"A":["1","2"], "B":["abc","bcd"]})
   A    B
0  1  abc
1  2  bcd

I want to convert column A to the best possible dtype using a pandas function. Before the conversion, the dtypes are:

print(df.dtypes)
A    object
B    object
dtype: object

This is expected, since the numbers were written inside quotes. But when I call df.convert_dtypes(), column A becomes string rather than an integer dtype, even though both of its values are integers written as strings:

print(df.convert_dtypes().dtypes)
A    string
B    string
dtype: object

I was expecting:

A    Int32
B    string
dtype: object

I have also tried pandas.api.types.infer_dtype, but that doesn't help either.
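For reference, here is a minimal sketch of what infer_dtype does with this DataFrame. It only returns a label describing the values it sees and does not convert anything. The variable name `labels` is just for illustration.

```python
import pandas as pd
from pandas.api.types import infer_dtype

df = pd.DataFrame({"A": ["1", "2"], "B": ["abc", "bcd"]})

# infer_dtype inspects the values and returns a descriptive label as a str;
# the column itself is left unchanged.
labels = {col: infer_dtype(df[col]) for col in df.columns}
print(labels)
```

Both columns are reported as 'string', because the values in A are Python str objects that merely look like numbers.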

What I want is a function that converts each column to whichever of the types below fits best, without my having to specify it manually. Any help would be appreciated!

string, bytes, floating, integer, mixed-integer, mixed-integer-float, decimal, complex, categorical, boolean, datetime64, datetime, date, timedelta64, timedelta, time, period, mixed, unknown-array
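One possible direction, sketched here under assumptions rather than offered as a definitive solution: try pd.to_numeric on each string-like column, keep the column as it is when parsing fails, and then call convert_dtypes() on the result. The helper name `parse_numeric_strings` is hypothetical, and the sketch only covers the numeric case, not datetimes, booleans, or the other types in the list.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def parse_numeric_strings(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: parse string columns that contain only numbers."""
    out = frame.copy()
    for col in out.columns:
        # Only attempt parsing on object/string columns.
        if not (is_object_dtype(out[col]) or is_string_dtype(out[col])):
            continue
        try:
            # Raises ValueError if any value cannot be parsed as a number.
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # Leave non-numeric columns untouched.
            pass
    # Switch to the nullable extension dtypes.
    return out.convert_dtypes()


df = pd.DataFrame({"A": ["1", "2"], "B": ["abc", "bcd"]})
result = parse_numeric_strings(df)
print(result.dtypes)
```

With this sketch, column A ends up as a nullable integer dtype (Int64 rather than Int32, since pd.to_numeric parses to 64-bit integers by default) and column B stays a string column.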

Pythoneer