I'm trying to add boolean columns to a dataframe that indicate, for each column I iterate over, whether its values are alphanumeric, alphabetic, or numeric. Unfortunately, every new column comes out False for all three tests. For a given column, how can I add another column that shows whether each row's value in that column is alphanumeric? I don't want to iterate through each row, since that is very time-consuming. I need this because I won't always know what data type a given column contains.

def add_numeric_alpha_alphanum_tests(dataframe, dataframe_column_names):
    for column_name in dataframe_column_names:
        column_name_is_alphanumeric = column_name + "_is_alphanumeric"
        data_to_test = str(dataframe[column_name].values)
        dataframe[column_name_is_alphanumeric] = np.where(data_to_test.isalnum(), True, False)
        column_name_is_alpha = column_name + "_is_alpha"
        dataframe[column_name_is_alpha] = np.where(data_to_test.isalpha(), True, False)
        column_name_is_digit = column_name + "_is_digit"
        dataframe[column_name_is_digit] = np.where(data_to_test.isdigit(), True, False)
    return dataframe
Todd Baker

1 Answer

You can use the apply function in pandas to run each test on every value in a column without writing the loop yourself. For example:

# str(x) lets the tests run on non-string values; each method already returns a bool
dataframe['column1_is_alphanumeric'] = dataframe['column1'].apply(lambda x: str(x).isalnum())
dataframe['column1_is_alpha'] = dataframe['column1'].apply(lambda x: str(x).isalpha())
dataframe['column1_is_digit'] = dataframe['column1'].apply(lambda x: str(x).isdigit())
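
As for why the function in the question returns False everywhere: str(dataframe[column_name].values) turns the whole array into one string, something like "['a1' 'b2']". The brackets, quotes, and spaces make isalnum(), isalpha(), and isdigit() all False for that single string, and that one result ends up in every row. Below is a minimal sketch of the same loop over column names, using the pandas .str accessor so each value is tested separately. The function name add_string_tests and the sample data are made up for illustration; they are not from the question.

```python
import pandas as pd

def add_string_tests(dataframe, column_names):
    """Add <col>_is_alphanumeric, <col>_is_alpha and <col>_is_digit columns.

    Hypothetical helper: the name and layout are illustrative only.
    """
    for column_name in column_names:
        # Convert each value to text individually, not the whole array at once.
        as_text = dataframe[column_name].astype(str)
        dataframe[column_name + "_is_alphanumeric"] = as_text.str.isalnum()
        dataframe[column_name + "_is_alpha"] = as_text.str.isalpha()
        dataframe[column_name + "_is_digit"] = as_text.str.isdigit()
    return dataframe

# Made-up sample data to show the three tests side by side.
df = pd.DataFrame({"code": ["abc", "123", "a1b2", "x-9"]})
df = add_string_tests(df, ["code"])
print(df)
```

Keep in mind that the tests run on the text form of each value, so a float such as 1.5 becomes "1.5" and fails all three tests because of the decimal point.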
Paul Lo
  • How does the right hand side of those assignments know which column of the dataframe is being tested? Seems that the x is a row in the dataframe, but I don't see a column designation. – Todd Baker Dec 10 '19 at 19:02
  • @ToddBaker My bad, updated the code. You can see it now specifies which column we are testing against on the righthand side : ) – Paul Lo Dec 10 '19 at 19:11