I have two 5 GB CSV files with 10 columns each. I need to apply update/insert logic and generate a final CSV by comparing the two files.
How can I do this in Python with pandas?
Ex:
If you know of any alternative solutions for this job, let me know.
Try the isin() method or the merge() method to compare the two CSV files.
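import pandas as pd

csv1 = pd.read_csv("file1.csv")
csv2 = pd.read_csv("file2.csv")

# Comparing the data using isin(): rows of csv1 that also appear, value for value, in csv2
result = csv1[csv1.apply(tuple, axis=1).isin(csv2.apply(tuple, axis=1))]
print(result)

# Comparing the data using merge(): rows that appear in only one of the two files
# (the '_merge' column says 'left_only' for csv1 and 'right_only' for csv2)
result2 = csv1.merge(csv2, indicator=True, how='outer').loc[lambda v: v['_merge'] != 'both']
print(result2)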
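To see what the merge() comparison returns, here is a small in-memory sketch. The column names `id` and `qty` and the values are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files
csv1 = pd.DataFrame({"id": [1, 2, 3], "qty": [10, 5, 7]})
csv2 = pd.DataFrame({"id": [2, 3, 4], "qty": [5, 8, 3]})

# With no "on" argument, merge() joins on every shared column (id and qty here),
# so a row only counts as "both" when all of its values match.
merged = csv1.merge(csv2, how="outer", indicator=True)

# Keep the rows that exist in only one of the two frames
diff = merged[merged["_merge"] != "both"]
print(diff)
```

Row `id=3` shows up twice in `diff`: once as `left_only` (qty 7) and once as `right_only` (qty 8), which is how a changed row looks in this comparison. Rows `id=1` and `id=4` exist in only one frame.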
To update or insert rows in CSV files, check out the following link: "Updating Values in csv files".
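For the update/insert step itself, one possible approach is to concatenate the two frames and keep the last row per key. This is only a sketch under an assumption the question does not confirm: that both files share a unique key column, called `id` here (a hypothetical name). The helper `upsert` is also made up for this example, and the tiny in-memory CSVs stand in for file1.csv and file2.csv.

```python
import io
import pandas as pd

def upsert(old: pd.DataFrame, new: pd.DataFrame, key: str) -> pd.DataFrame:
    """Rows in `new` replace rows in `old` with the same key; new keys are appended."""
    combined = pd.concat([old, new], ignore_index=True)
    # `new` comes last in the concat, so keep="last" lets its rows win on a key clash
    deduped = combined.drop_duplicates(subset=key, keep="last")
    return deduped.sort_values(key).reset_index(drop=True)

# Small in-memory stand-ins for the real files (assumed key column: "id")
old = pd.read_csv(io.StringIO("id,name,qty\n1,apple,10\n2,pear,5\n3,plum,7\n"))
new = pd.read_csv(io.StringIO("id,name,qty\n2,pear,9\n4,kiwi,3\n"))

final = upsert(old, new, key="id")
print(final.to_csv(index=False))
```

With the real files you would replace the two `io.StringIO(...)` arguments with the file paths and write the result with `final.to_csv("final.csv", index=False)`. Note that this loads both files into memory at once, so with two 5 GB inputs it needs a machine with plenty of RAM.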