0

I am writing a code to compare the datatype of each value in an excel spreadsheet to a database definition spreadsheet which lists the required field/datatypes.

I am using dataframe.applymap(type) to check all of the values in the excel sheet which holds my data.

data_types = location_df.applymap(type)

The output of the above block is this:

  Location_ID    Location_Name      AltName          X_UTM            Y_UTM           Type_Code            QA_QC         Comments
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>

I want to change the values in the output dataframe so that the value matches an analogous string value found in my database definition file. For example, I want to change all instances of <class 'str'> to 'TEXT'.

I've been trying to use pandas.replace to do the job, but it's not giving the desired results:

for col in data_types:
     data_types[col] = data_types[col].replace("<class 'str'>", "TEXT", regex=True)

The print output does not have a changed value:

      Location_ID    Location_Name      AltName          X_UTM            Y_UTM           Type_Code            QA_QC         Comments
     <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>

I've been able to use the above .replace method to change values in a dataframe when the values are not in the <class*> format. Is anyone able to explain why the .replace method does not work in this case, as well as the 'correct' method to manipulate the values in the data_types dataframe?

  • Use `astype(str)` before? `data_types.astype(str).replace("", "TEXT", regex=False)` – Corralien Jul 04 '22 at 14:23
  • Umm... can you do something like: `df.applymap(lambda v: type(v).__name__).apply(pd.value_counts)` and then rename the index accordingly? – Jon Clements Jul 04 '22 at 14:30
  • Try this out, turn the type apparence in to `str` , then you can replace the text,`data_types.applymap(type).applymap(lambda x:str(x)).applymap(lambda x: x.replace("", 'TEXT'))` – Baron Legendre Jul 04 '22 at 19:27

0 Answers0