This is an extended question to this post.
I'm working on a large dataset and it seems to be very messy containing random strings here and there with no order. I would like to check on these values column wise without printing the numeric values of those columns. For example if df.Col
is a float
object but has strings such as #$, ^%#, etc., is there a way to view/print only those?
EDIT:
For example I have a column Client_Income
.
print(str(df["Client_Income"].name).upper())
print("No. of unique values:", df["Client_Income"].nunique())
print(df["Client_Income"].unique())
CLIENT_INCOME
No. of unique values: 1217
['6750' '20250' '18000' ... '13140' '9764.1' '12840.75']
While this shows that the column could be purely float
deliberately converted to object
, when I do .astype("float16")
for example, I get the error Cannot convert to float: '$'
. There are other columns that contains values such as 'XNA', '##', etc.