1

I am trying to rename every field that has "." in its name, replacing each "." with "_".

This is what I have:

def apply_renaming_mapping(df):
    """Given a dynamic data frame, if the field contains ., replace with _"""
    # construct renaming mapping for ApplyMapping
    mappings = list()
    # for field in df.schema.fields:
    for name, dtype in df.dtypes:
        if '.' in name:
            mappings.append((name, dtype, name.replace('.', '_'), dtype))
    # apply mapping
    reanmed= ApplyMapping(frame=df, mappings=mappings)
    return renamed

But I think I am missing some pieces. I keep getting the following error:

in relationalize_and_write
    renamed = apply_renaming_mapping(m_df.toDF())
File apply_renaming_mapping
    reanmed= ApplyMapping(frame=df, mappings=mappings)
TypeError: ApplyMapping() takes no arguments

During handling of the above exception, another exception occurred:
Traceback (most recent call last):

What am I doing wrong here?

molly_567

2 Answers

1

This is a function I use in my ETL to rename columns in one step. It takes a DataFrame and a dictionary such as {'old_name_1': 'new_name_1'}.

def rename_dataframe_columns(df, old_new_column_names):
    if isinstance(old_new_column_names, dict):
        for old_name, new_name in old_new_column_names.items():
            df = df.withColumnRenamed(old_name, new_name)
        return df

    raise ValueError("'old_new_column_names' should be a dict, like {'old_name_1':'new_name_1'}")

A simple for loop over df.columns is enough to build the dictionary.
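As a minimal sketch of that loop, the mapping can be built from a plain list of column names. The helper name build_rename_mapping and the sample column names are hypothetical, chosen only for illustration; in a real job the list would come from df.columns.

```python
def build_rename_mapping(columns, old=".", new="_"):
    """Map each column name containing `old` to the same name with `old` replaced by `new`.

    Columns that do not contain `old` are left out, so they are not renamed.
    """
    return {name: name.replace(old, new) for name in columns if old in name}


# Hypothetical column names; in practice pass df.columns instead.
mapping = build_rename_mapping(["id", "address.city", "address.zip"])
print(mapping)
```

The result could then be passed straight to the function above, for example rename_dataframe_columns(df, build_rename_mapping(df.columns)).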

fernolimits
0

Answering my own question:

  1. If you have a DynamicFrame, convert it to a DataFrame with toDF().
  2. Use df.columns to build the new column names and rename the columns.
  3. Convert the DataFrame back to a DynamicFrame.
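The three steps above could be sketched roughly as follows. This is an illustration rather than the code actually used: the function names sanitized_names and rename_dotted_fields and the default frame name "renamed" are hypothetical, and the sketch assumes it runs inside a Glue job where awsglue is importable and a GlueContext is available.

```python
def sanitized_names(columns):
    """Return the column names in the same order, with every '.' replaced by '_'."""
    return [name.replace(".", "_") for name in columns]


def rename_dotted_fields(dyf, glue_context, name="renamed"):
    """Rename dotted fields of a DynamicFrame by round-tripping through a DataFrame."""
    # Imported lazily because awsglue only exists in the Glue job environment.
    from awsglue.dynamicframe import DynamicFrame

    df = dyf.toDF()                                # 1. DynamicFrame -> DataFrame
    df = df.toDF(*sanitized_names(df.columns))     # 2. rename all columns by position
    return DynamicFrame.fromDF(df, glue_context, name)  # 3. back to a DynamicFrame
```

One thing to watch for with this approach: if the frame already has a column such as a_b alongside a.b, the replacement produces duplicate names, so it may be worth checking the sanitized list for duplicates first.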
molly_567