I am having an issue replacing a substring with the value from another column. I want to replace the text 'Length' in the Formula column with the value of df['Length'] from the same row.

df["Formula"] = df["Formula"].replace('Length', df['Length'], regex=True)

Below is my data

Input:
**Formula**  **Length**
Length           5
Length+1.5       6
Length-2.5       5
Length           4
5                5

Expected Output:
**Formula**  **Length**
5                5
6+1.5            6
5-2.5            5
4                4
5                5

However, the code above replaces the entire cell instead of only 'Length', and I get the output below. I found that this happens because df['column'] is used as the replacement; if I use any other string, the offset after it (-1.5) is not replaced.

**Formula**  **Length**
5                5
6                6
5                5
4                4
5                5

Is there a replace method that takes its replacement values from another column?

Thank you.

sammywemmy
Js _ lfzr

2 Answers


To replace using the value from another column in the same row, one option is DataFrame.apply with axis=1:

df["Formula"] = df.apply(lambda x: x['Formula'].replace('Length', str(x['Length'])), axis=1)
print(df)
  Formula  Length
0       5       5
1   6+1.5       6
2   5-2.5       5
3       4       4
4       5       5

Or use a list comprehension:

df["Formula"] = [x.replace('Length', str(y)) for x, y in df[['Formula', 'Length']].to_numpy()]
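
For a self-contained check, here is a minimal sketch that rebuilds the sample data from the question and applies the same row-wise replacement. The DataFrame construction is an assumption based on the table shown in the question (Formula as text, Length as integers), and zip is used here in place of to_numpy() only so that each column keeps its own dtype:

```python
import pandas as pd

# Sample data reconstructed from the question's table
# (assumed: Formula holds strings, Length holds integers)
df = pd.DataFrame({
    "Formula": ["Length", "Length+1.5", "Length-2.5", "Length", "5"],
    "Length": [5, 6, 5, 4, 5],
})

# Pair each formula with the length from the same row,
# then substitute the length for the literal text "Length"
df["Formula"] = [
    formula.replace("Length", str(length))
    for formula, length in zip(df["Formula"], df["Length"])
]

print(df)
```

Rows whose Formula does not contain 'Length' (the last row here) pass through unchanged, since str.replace leaves the string alone when there is nothing to match.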
jezrael
  • Thanks jezrael :) It works well! Thanks for introducing me to .apply and for the alternative solution. I really appreciate it :) – Js _ lfzr Jul 20 '20 at 07:10

Just wanted to add that the list comprehension is much faster:

import pandas as pd

# One million rows; %timeit is an IPython/Jupyter magic command
df = pd.DataFrame({'a': ['aba'] * 1000000, 'c': ['c'] * 1000000})

%timeit df.apply(lambda x: x['a'].replace('b', x['c']), axis=1)
# 1 loop, best of 5: 11.8 s per loop

%timeit [x.replace('b', str(y)) for x, y in df[['a', 'c']].to_numpy()]
# 1 loop, best of 5: 1.3 s per loop
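
If you are not in IPython or Jupyter, a rough equivalent can be sketched with the standard timeit module. The row count, the repeat count and the helper names with_apply and with_comprehension are arbitrary choices for illustration, and the timings will depend on your machine and pandas version:

```python
import timeit

import pandas as pd

# Smaller frame than the one above so the comparison finishes quickly
# (n is an arbitrary choice for this sketch)
n = 10_000
df = pd.DataFrame({"a": ["aba"] * n, "c": ["c"] * n})


def with_apply():
    # Row-wise apply: builds a Series object for every row
    return df.apply(lambda x: x["a"].replace("b", x["c"]), axis=1)


def with_comprehension():
    # Plain Python loop over the underlying object array
    return [x.replace("b", str(y)) for x, y in df[["a", "c"]].to_numpy()]


# Both approaches should produce the same values
assert with_apply().tolist() == with_comprehension()

t_apply = timeit.timeit(with_apply, number=3)
t_comp = timeit.timeit(with_comprehension, number=3)
print(f"apply: {t_apply:.3f}s, list comprehension: {t_comp:.3f}s")
```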
pdaawr