I apologize in advance if this has been covered, I could not find anything quite like this. This is my first programming job (I was previously software QA) and I've been beating my head against a wall on this.
I have 2 dataframes, one is very large [df2] (14.6 million lines) and I am iterating through it in chunks. I attempted to compare a column of the same name in each dataframe, if they're equal I would like to output a secondary column of the larger frame.
i.e.
if df1['tag'] == df2['tag']:
df1['new column'] = df2['plate']
I attempted a merge but this didn't output what I expected.
df3 = pd.merge(df1, df2, on='tag', how='left')
I hope I did an okay job explaining this.
[Edit:] I also believe I should mention that df2 and df1 both have many additional columns I do not want to interact with/change. Is it possible to only compare the single columns of two dataframes, and output the third additional column?