I am exporting data from Python to an SQL database, and for performance reasons I want the exported data to be registered with the correct type. To that end, I'm trying to create a Pandas Series of my data with the correct data type. I assume that the dtype attribute of a pd.Series object reports the data type of its underlying elements. I'm having trouble getting this to work for string data.
Here's a code sample demonstrating the problem:
import pandas as pd
orig_data_string = ['abc'] * 10
pd_data_string = pd.Series(orig_data_string)
pd_data_string.dtype
Running the above in a Python console yields dtype('O'), which I take to indicate an object type. What I would like is for this to be a string type instead. Now, I can do something similar with numerical values:
orig_data_float = [1.23] * 10
pd_data_float = pd.Series(orig_data_float)
pd_data_float.dtype
and in this case, I get the result dtype('float64'), so Pandas has correctly inferred the data type from the list input. If I try pd.Series(orig_data_string).astype(str), I get the same result as for the string list, dtype('O'). How can I create a Pandas Series object with underlying data type str from a list of strings?
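For reference, here is a minimal self-contained sketch of the checks described above. It also tries passing dtype="string" explicitly, which assumes pandas 1.0 or later, where a dedicated string extension dtype exists; on older versions that line would fail. The names inferred, all_str and explicit are only illustrative, and whether an SQL export layer makes use of this dtype is not something this sketch establishes.

```python
import pandas as pd

orig_data_string = ['abc'] * 10

# Default construction: the reported dtype depends on the pandas version
# (dtype('O') in the session described above).
inferred = pd.Series(orig_data_string)
print(inferred.dtype)

# Whatever dtype is reported, each element is still a Python str.
all_str = all(isinstance(value, str) for value in inferred)
print(all_str)

# Assumption: pandas >= 1.0, which provides the "string" extension dtype.
explicit = pd.Series(orig_data_string, dtype="string")
print(explicit.dtype)
```

If the assumption holds, the last Series carries a dedicated string dtype rather than the generic object dtype, while the values themselves are unchanged.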