This is what I was asked to do:
Remove the dollar sign and comma from the columns. If necessary, convert these two columns to the appropriate data type.
My dataset does not contain values with a $ sign, so for the sake of the exercise I am replacing the "." in the reviews-per-month numbers with ",":
def remove_commas(value):
    if pd.isna(value):
        return np.NaN
    else:
        return float(value.replace(".", ","))

df["reviews per month"]=df["reviews_per_month"].apply(lambda x: remove_commas(x))"
Error Message number 1:
File "/var/folders/vr/bbf8y6555gs306xzf_x7zxf80000gn/T/ipykernel_22769/1957524384.py", line 1
df["reviews per month"]=df["reviews_per_month"].apply(lambda x: remove_commas(x))"
^
SyntaxError: EOL while scanning string literal
Error Message number 2:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3628 try:
-> 3629 return self._engine.get_loc(casted_key)
3630 except KeyError as err:
/opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
/opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'reviews per month'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
/var/folders/vr/bbf8y6555gs306xzf_x7zxf80000gn/T/ipykernel_22769/969712826.py in <module>
----> 1 df["reviews per month"]
/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py in __getitem__(self, key)
3503 if self.columns.nlevels > 1:
3504 return self._getitem_multilevel(key)
-> 3505 indexer = self.columns.get_loc(key)
3506 if is_integer(indexer):
3507 indexer = [indexer]
/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3629 return self._engine.get_loc(casted_key)
3630 except KeyError as err:
-> 3631 raise KeyError(key) from err
3632 except TypeError:
3633 # If we have a listlike key, _check_indexing_error will raise
KeyError: 'reviews per month'
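One possible reading of the two errors together, as a sketch rather than a confirmed diagnosis: the assignment line ends with a stray `"`, so Python rejects the whole cell with a SyntaxError and nothing is assigned. The column "reviews per month" (with spaces) therefore never exists, and looking it up later raises the KeyError. The small DataFrame below is made-up data, not the real dataset, and the identity lambda only stands in for the real conversion.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset.
df = pd.DataFrame({"reviews_per_month": [0.20, 1.50]})

# The cell with the stray trailing quote never ran, so the new column
# was never created. Only the original, underscored name is present.
has_column_before = "reviews per month" in df.columns

# Without the trailing quote the statement is valid and creates the column.
# The identity lambda is a placeholder for the actual conversion function.
df["reviews per month"] = df["reviews_per_month"].apply(lambda x: x)
has_column_after = "reviews per month" in df.columns

print(has_column_before, has_column_after)
```

Note that the new name uses spaces while the existing column uses underscores, so even a successful run adds a second column instead of changing `reviews_per_month`.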
Question: What is the issue? Could it be related to the data type?
The data type displayed for this column is:
reviews_per_month float64
I was expecting the values in this column to change from "reviews_per_month: 0.20" to "reviews_per_month: 0,20".
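A minimal sketch of one way to get that output, assuming the column really is float64 as shown above. A float has no `.replace` method, and `float("0,20")` raises a ValueError, so a value written with a decimal comma can only be stored as a string. The helper `to_decimal_comma`, the new column name `reviews_per_month_str`, the two-decimal format, and the sample values are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the same dtype as the real column.
df = pd.DataFrame({"reviews_per_month": [0.20, 1.5, np.nan]})

def to_decimal_comma(value):
    """Format a float with two decimals and a comma as the decimal mark."""
    if pd.isna(value):
        return np.nan  # keep missing values missing
    # Format the number as text first, then swap the decimal point.
    return f"{value:.2f}".replace(".", ",")

# Store the text version in a separate column so the numeric one stays usable.
df["reviews_per_month_str"] = df["reviews_per_month"].apply(to_decimal_comma)

print(df)
```

The formatted column holds strings, so arithmetic and aggregation should keep using the original `reviews_per_month` column.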