I am writing a code to compare the datatype of each value in an excel spreadsheet to a database definition spreadsheet which lists the required field/datatypes.
I am using dataframe.applymap(type) to check all of the values in the excel sheet which holds my data.
data_types = location_df.applymap(type)
The output of the above block is this:
Location_ID Location_Name AltName X_UTM Y_UTM Type_Code QA_QC Comments
<class 'int'> <class 'str'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'str'>
<class 'int'> <class 'str'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'str'>
<class 'int'> <class 'str'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'str'>
<class 'int'> <class 'str'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'str'>
<class 'int'> <class 'str'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'>
<class 'int'> <class 'str'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'str'>
I want to change the values in the output dataframe so that the value matches an analogous string value found in my database definition file. For example, I want to change all instances of <class 'str'> to 'TEXT'.
I've been trying to use pandas.replace to do the job, but it's not giving the desired results:
for col in data_types:
data_types[col] = data_types[col].replace("<class 'str'>", "TEXT", regex=True)
The print output does not have a changed value:
Location_ID Location_Name AltName X_UTM Y_UTM Type_Code QA_QC Comments
<class 'int'> <class 'str'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'float'> <class 'str'>
I've been able to use the above .replace method to change values in a dataframe when the values are not in the <class*> format. Is anyone able to explain why the .replace method does not work in this case, as well as the 'correct' method to manipulate the values in the data_types dataframe?