4

I have two Dataframes df and df1.

Main DataFrame is as follows:
DF:

    start   end price
0   A   Z   1
1   B   Y   2
2   C   X   3
3   A   Z   4
4   D   W   5

Second DataFrame:
DF1:

start   end price
    0   A   Z   100
    1   B   Y   200

I want the main dataframe df to update the values in 'price' columns based on the start and end in df1. it should update column value for all the rows having the same start and end as in df1. DF:

start   end price
0   A   Z   100
1   B   Y   200
2   C   X   3
3   A   Z   100
4   D   W   5

(all A-Z and B-Y in df should get updated). Is there anyway I can get this output ? In reality the datframes have more columns but I want to update only one column(eg.'Price').

niraj
  • 17,498
  • 4
  • 33
  • 48
AISH
  • 61
  • 1
  • 5

3 Answers3

2

First, you can merge:

s = df1.merge(df2, left_on=['start', 'end'], right_on=['start', 'end'], how='left')

Then you can fillna and index your desired columns:

s.assign(price=s.price_y.fillna(s.price_x))[['start', 'end', 'price']]

  start end  price
0     A   Z  100.0
1     B   Y  200.0
2     C   X    3.0
3     A   Z  100.0
4     D   W    5.0
user3483203
  • 50,081
  • 9
  • 65
  • 94
  • thaks. I have multiple columns in the main dataframe df (except start and end), I want that too in the result. how can I modify your code to get all columns from the main dataframe df. – AISH Sep 21 '18 at 06:51
2

Using update

df=df.set_index(['start','end'])
df.update(df1.set_index(['start','end']))
df.reset_index()
Out[99]: 
  start end  price
0     A   Z  100.0
1     B   Y  200.0
2     C   X    3.0
3     A   Z  100.0
4     D   W    5.0
BENY
  • 317,841
  • 20
  • 164
  • 234
1

merge

df.drop('price', 1).merge(df1, 'left').fillna(df)

  start end  price
0     A   Z  100.0
1     B   Y  200.0
2     C   X    3.0
3     A   Z  100.0
4     D   W    5.0

  1. I'm going to merge on ['start', 'end'] and that pesky price is going to get in my way. So, I drop it.
  2. I need to preserve df index because I have that repeat of 'A' and 'Z'. So, I use a 'left' merge
  3. Now my missing elements can be filled back in with df
piRSquared
  • 285,575
  • 57
  • 475
  • 624
  • 1
    Very nice, much cleaner – user3483203 Sep 21 '18 at 05:37
  • It gives me the exact output, but when I run it in a loop( in my code df1 keeps changing and makes updates to the mail dataframe df ), it gave me error saying " TypeError: unhashable type: 'dict' ". – AISH Sep 21 '18 at 07:16
  • Loop? What? Why? I have no idea what you are doing in your loop. I’d suggest accepting one of these answers and asking a new question that addresses the loop issue. – piRSquared Sep 21 '18 at 10:34