I have a DataFrame of French words, their endings, and new endings. I want to create a fourth column containing the word with its ending swapped for the new one, like this:

word   |ending|new ending|what i want|
--------------------------------------
placer |cer   |ceras     |placeras   |
placer |cer   |cerait    |placerait  |
placer |cer   |ceront    |placeront  |
finir  |ir    |iras      |finiras    |

In other words, for each row I want to replace the part of column 1 that matches column 2 with the value in column 3.

Any ideas?

jpp
NTiberio

3 Answers


Here is one way using the .loc accessor:

import pandas as pd

df = pd.DataFrame({'word': ['placer', 'placer', 'placer'],
                   'ending': ['cer', 'cer', 'cer'],
                   'new_ending': ['ceras', 'cerait', 'ceront']})

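# Start from the original word; rows whose word does not end with `ending` keep it.
df['result'] = df['word']
# Length of each ending, so the slices below work for endings of any length.
df['lens'] = df['ending'].map(len)

# Where the last `lens` characters of the word equal `ending`, drop them and append `new_ending`.
mask = pd.Series([i[-j:] for i, j in zip(df['word'], df['lens'])]) == df['ending']
df.loc[mask, 'result'] = \
pd.Series([i[:-j] for i, j in zip(df['word'], df['lens'])]) + df['new_ending']

# Drop the helper column.
df = df[['word', 'ending', 'new_ending', 'result']]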

#      word ending new_ending     result
# 0  placer    cer      ceras   placeras
# 1  placer    cer     cerait  placerait
# 2  placer    cer     ceront  placeront
jpp
    I notice you use -3, but it's entirely possible the length of the ending isn't (I have edited my question to reflect that). So I want the solution to work for everything at once. – NTiberio Mar 08 '18 at 20:21

Using apply():

df['new_word'] = df.apply(
    lambda row: row['word'].replace(row['ending'], row['new ending']),
    axis=1
)
#     word ending new ending   new_word
#0  placer    cer      ceras   placeras
#1  placer    cer     cerait  placerait
#2  placer    cer     ceront  placeront
#3   finir     ir       iras    finiras

As @jpp pointed out, a caveat to this approach is that it won't work correctly if the ending is present in the middle of the string.

In that case, refer to this post on how to replace at the end of the string.
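
For illustration, here is a minimal sketch of an end-only replacement. It assumes the column names from the question; `swap_ending` is a hypothetical helper, not something from pandas or from the linked post. It only touches the word when it actually ends with the given ending, so an ending that also appears in the middle of the word is left alone.

```python
import pandas as pd

df = pd.DataFrame({
    'word': ['placer', 'placer', 'placer', 'finir'],
    'ending': ['cer', 'cer', 'cer', 'ir'],
    'new ending': ['ceras', 'cerait', 'ceront', 'iras'],
})

def swap_ending(word, ending, new_ending):
    """Hypothetical helper: replace `ending` with `new_ending` only at the end of `word`."""
    if ending and word.endswith(ending):
        # Slice off exactly len(ending) trailing characters, then append the new ending.
        return word[:len(word) - len(ending)] + new_ending
    return word  # no match at the end: leave the word unchanged

df['new_word'] = [
    swap_ending(w, e, n)
    for w, e, n in zip(df['word'], df['ending'], df['new ending'])
]
```

With a word such as 'cercer', plain `str.replace` would rewrite both occurrences of 'cer', whereas this helper only rewrites the last one.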

pault

Here is another solution:

df.word.replace(df.ending, '', regex=True).str.cat(df["new ending"].astype(str))

And the output:

0     placeras
1    placerait
2    placeront
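
Note that the patterns in the one-liner above are plain endings, so they are not anchored to the end of the word. As a hedged variant (a sketch, not the original answer's method), the replacement can be done row by row with Python's `re.sub`, escaping each ending and anchoring it with `$`. The column names are taken from the question; the `new_word` column name is an assumption.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'word': ['placer', 'placer', 'placer', 'finir'],
    'ending': ['cer', 'cer', 'cer', 'ir'],
    'new ending': ['ceras', 'cerait', 'ceront', 'iras'],
})

# re.escape guards against endings that contain regex metacharacters;
# the trailing `$` restricts the match to the end of the word.
# A function is used as the replacement so the new ending is inserted literally.
df['new_word'] = [
    re.sub(re.escape(e) + r'$', lambda m: n, w)
    for w, e, n in zip(df['word'], df['ending'], df['new ending'])
]
```

Each row uses only its own ending here, so one row's ending cannot affect another row's word.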
Espoir Murhabazi