pandas deliberately uses native Python strings, which require an object dtype. See the pandas distinction between the str and object types.
Also see: https://pandas.pydata.org/docs/user_guide/text.html
import pandas as pd
df = pd.DataFrame({"A": ["a", "b", "c", "a"]})
df["B"] = df["A"].astype("category")
df["C"] = df["A"].astype("string")
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   A       4 non-null      object
 1   B       4 non-null      category
 2   C       4 non-null      string
dtypes: category(1), object(1), string(1)
memory usage: 328.0+ bytes
print(df)
   A  B  C
0  a  a  a
1  b  b  b
2  c  c  c
3  a  a  a
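
The three columns print identically, so the difference lies only in how the values are stored. The following is a minimal sketch of one way to inspect that; the names element_types, categories, codes and memory are illustrative and not part of the original example, and the exact memory figures will vary with the pandas version and string backend.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c", "a"]})
df["B"] = df["A"].astype("category")
df["C"] = df["A"].astype("string")

# Whatever the column dtype, a value read back out is a native Python str.
element_types = {col: type(df[col].iloc[0]).__name__ for col in df.columns}

# A categorical column keeps each distinct value once, plus one integer code per row.
categories = df["B"].cat.categories.tolist()
codes = df["B"].cat.codes.tolist()

# deep=True also counts the memory held by the Python string objects themselves.
memory = df.memory_usage(deep=True)

print(element_types)
print(categories, codes)
print(memory)
```

The categorical column stores each distinct value once and refers to it through small integer codes, so it tends to pay off when the same strings repeat many times.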