A pandas DataFrame is a collection of Series objects, one per column, and each Series has a single dtype, so you can't have more than one data type per column. For example, a column with [2, 'dog', 3] will have the dtype object because of the string; likewise, [2, 2.5, 3] can't be type int because of the 2.5 (it becomes float64).
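As a quick check of the point above, here is a minimal sketch that builds the two example columns as standalone Series and prints the dtype pandas infers. The variable names (mixed, floats, ints) are just for illustration, and the outputs in the comments assume default pandas type inference.

```python
import pandas as pd

# One string forces the whole column to the generic object dtype.
mixed = pd.Series([2, 'dog', 3])

# One float upcasts the whole column to float64.
floats = pd.Series([2, 2.5, 3])

# Only an all-integer column is stored as int64.
ints = pd.Series([2, 3, 4])

print(mixed.dtype)   # object
print(floats.dtype)  # float64
print(ints.dtype)    # int64
```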
If you want to work row-based, you'll need to transpose your DataFrame using df.transpose() (or the shorthand df.T), which makes your columns become rows and your rows become columns. If you're importing data, you can transpose the DataFrame and then cast each column to the data type you want. If you're preparing data to be exported, transpose as your last step before exporting.
E.g.:
import pandas as pd
df = pd.DataFrame({'col_1': [1, 'cat', 3],
                   'col_2': [4, 'dog', 6]},
                  index=['row_1', 'row_2', 'row_3'])
>>> df
      col_1 col_2
row_1     1     4
row_2   cat   dog
row_3     3     6
# Due to the strings, both columns are dtype object
>>> df.dtypes
col_1    object
col_2    object
# Transpose the df
>>> df.T
      row_1 row_2 row_3
col_1     1   cat     3
col_2     4   dog     6
# Now our data is in columns but still dtype object
>>> df.T.dtypes
row_1    object
row_2    object
row_3    object
# We can cast our columns (originally rows) to new dtypes now
>>> df.T.astype({'row_1': 'int', 'row_2': str, 'row_3': 'int'})
      row_1 row_2 row_3
col_1     1   cat     3
col_2     4   dog     6
>>> df.T.astype({'row_1': 'int', 'row_2': str, 'row_3': 'int'}).dtypes
row_1     int64
row_2    object
row_3     int64
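
To round this out, here is a rough sketch of the whole workflow described earlier: transpose, cast, do some row-based work, then transpose back right before exporting. The names wide, restored, and the 'total' column are made up for illustration, and the export step itself is left out.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [1, 'cat', 3],
                   'col_2': [4, 'dog', 6]},
                  index=['row_1', 'row_2', 'row_3'])

# Transpose so each original row becomes a column, then cast per column.
wide = df.T.astype({'row_1': 'int', 'row_2': str, 'row_3': 'int'})

# The typed columns now support numeric operations.
# 'total' is a hypothetical helper column, not part of the original data.
wide['total'] = wide['row_1'] + wide['row_3']
print(wide['total'].tolist())  # [4, 10]

# Transpose back as the last step before exporting.
# Each column mixes numbers and strings again, so the per-row
# integer dtypes do not survive this step.
restored = wide.drop(columns='total').T
print(restored)
```

Keep in mind that the typed columns only exist while the data is transposed, so any dtype-dependent work should be done before transposing back.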