I have a dictionary of abbreviations of some city names that our system (for some reason) applies to data (i.e. 'Kansas City' is abbreviated 'Kansas CY', and Oklahoma City is spelled correctly).
I am having an issue getting my function to apply to the column of the dataframe but it works when I pass in strings of data. Code sample below:
def multiple_replace(text, dict):
# Create a regular expression from the dictionary keys
regex = re.compile("(%s)" % "|".join(map(re.escape, dict.keys())))
# For each match, look-up corresponding value in dictionary
return regex.sub(lambda mo: dict[mo.string[mo.start():mo.end()]], text)
testDict = {"Kansas CY": "Kansas City"}
dfData['PREV_CITY'] = dfData['PREV_CITY'].apply(multiple_replace, dict=testDict)
When I add 'axis=1' into that last line it errors out saying I provided too many args. Otherwise, it runs without error it just doesn't make the changes when there is a match to the dictionary.
Thank you in advance! -Reece