I have a pd.DataFrame whose columns hold numeric values stored as strings:
import pandas as pd
df = pd.DataFrame({"A":["1","2"], "B":["abc","bcd"]})
   A    B
0  1  abc
1  2  bcd
I tried to convert the dtype of column A using a pandas function that converts dtypes to the best possible type. Before calling the function, the dtypes were:
print(df.dtypes)
A    object
B    object
dtype: object
This is expected, since the numbers were quoted when the DataFrame was created. But when I called df.convert_dtypes(), column A was not converted to an integer dtype, even though both of its values are valid ints:
print(df.convert_dtypes().dtypes)
A    string
B    string
dtype: object
I was expecting:
A     Int32
B    string
dtype: object
I have also tried pandas.api.types.infer_dtype, but that doesn't work either.
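For context, as far as I can tell infer_dtype only returns a label describing the values and leaves the data itself untouched. A minimal check on the same DataFrame (the variable name label is just for illustration):

```python
import pandas as pd
from pandas.api.types import infer_dtype

df = pd.DataFrame({"A": ["1", "2"], "B": ["abc", "bcd"]})

# infer_dtype inspects the values and returns a descriptive label;
# it does not parse the strings or convert the column.
label = infer_dtype(df["A"])
print(label)  # reports 'string', because the values are Python str objects
```

So the digits inside the strings are never looked at, which seems to be why column A is not recognised as integer.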
What I want is a function that converts each column to the most suitable of the many possible dtypes (listed below) automatically, rather than me doing it manually. Any help would be appreciated!
string, bytes, floating, integer, mixed-integer, mixed-integer-float, decimal, complex, categorical, boolean, datetime64, datetime, date, timedelta64, timedelta, time, period, mixed, unknown-array
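To make the goal concrete, here is a rough sketch of the kind of helper I have in mind. It assumes that trying pd.to_numeric on each column first and then calling convert_dtypes() is an acceptable approach. auto_convert is a made-up name, not a pandas function, and the sketch only covers numeric strings, not dates or the other types in the list.

```python
import pandas as pd


def auto_convert(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: parse numeric-looking columns, then pick nullable dtypes."""
    out = df.copy()
    for col in out.columns:
        try:
            # Raises ValueError if any value cannot be parsed as a number.
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # Leave non-numeric columns (such as "B") unchanged.
            pass
    # Let pandas choose nullable extension dtypes for the parsed result.
    return out.convert_dtypes()


df = pd.DataFrame({"A": ["1", "2"], "B": ["abc", "bcd"]})
result = auto_convert(df)
print(result.dtypes)
```

With this sketch, column A should come out as a nullable integer dtype and column B should stay a string dtype. I would expect the integer width to be whatever pandas picks by default rather than necessarily Int32.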