I'd like to check whether all values have the same types as the values in the first row. Somehow df.applymap and Series.apply don't behave the way I assumed they would.
The dataset is from the IMDb sentiment analysis on Kaggle.
print(df.head())
id sentiment review
0 "5814_8" 1 "With all this stuff going down at the moment ...
1 "2381_9" 1 "\"The Classic War of the Worlds\" by Timothy ...
2 "7759_3" 0 "The film starts with a manager (Nicholas Bell...
3 "3630_4" 0 "It must be assumed that those who praised thi...
4 "9495_8" 1 "Superbly trashy and wondrously unpretentious ...
Each row seems to be str, int, str, so everything looks fine.
print(df.applymap(type))
id sentiment review
0 <class 'str'> <class 'int'> <class 'str'>
1 <class 'str'> <class 'int'> <class 'str'>
2 <class 'str'> <class 'int'> <class 'str'>
3 <class 'str'> <class 'int'> <class 'str'>
4 <class 'str'> <class 'int'> <class 'str'>
Calling apply on a single row (a Series) looks a little different: the sentiment is numpy.int64 instead of int.
print(df.iloc[0].apply(type))
id <class 'str'>
sentiment <class 'numpy.int64'>
review <class 'str'>
Name: 0, dtype: object
Maybe it's the same anyway, so I compared the types.
print(df.applymap(type) == df.iloc[0].apply(type))
id sentiment review
0 True False True
1 True False True
2 True False True
3 True False True
4 True False True
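The False values in the sentiment column are at least consistent with what a quick check outside pandas suggests: numpy.int64 and the built-in int are separate classes, even though both represent integers. A minimal sketch, assuming NumPy is available as np (the variable names are only for illustration):

```python
import numbers

import numpy as np

value = np.int64(1)

# numpy.int64 is its own class, not the built-in int and not a subclass of it
is_same_class = type(value) is int
is_int_instance = isinstance(value, int)

# .item() converts a NumPy scalar to the matching built-in Python type
as_python = value.item()

# Both still count as integers in the abstract sense
is_integral = isinstance(value, numbers.Integral)

print(is_same_class, is_int_instance, type(as_python), is_integral)
```

So comparing the class objects directly can only give True if both sides hold exactly the same class.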
The result is unexpected. At least the first row should be True, True, True. I use applymap on a DataFrame, which should be element-wise. The second call is apply on a Series, which should also be element-wise. So why are the results not equal?
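For reference, here is a minimal sketch of the check I'm actually after. It uses a small hand-built frame in place of the Kaggle file, and the values and variable names are made up for illustration. It sidesteps the mismatch by taking both sides of the comparison from the same element-wise type table, but it doesn't explain why the two calls above disagree.

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle data; the values are made up
df = pd.DataFrame({
    "id": ["5814_8", "2381_9", "7759_3"],
    "sentiment": [1, 1, 0],
    "review": ["first review", "second review", "third review"],
})

# Element-wise types, built column by column with Series.map
types = df.apply(lambda col: col.map(type))

# Compare every row against the first row of the same type table
matches = types.eq(types.iloc[0], axis="columns")

# True only if every value has the same type as the first row's value
all_match = bool(matches.all().all())
print(all_match)
```

Because the reference row comes from the same table as the rows it is compared against, any conversion pandas applies to the values is applied to both sides.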