I have extracted Table data from an image having multiple tables using Amazon textract and trying to map all the extracted data into a Output template CSV,
However there are multiple tables in the Extracted Input CSV file which are listed one below another. There are approx 7 tables which are listed one below other in each CSV.
Please suggest how to map the values from Input CSV to Output.
Input CSV file:
S.No Item Item_code 1st 2nd 3rd 4th Avg
1 Math_book BK001 27 36 35 23 30
2 Phy_book BJ008 30 40 40 30 35
3 Hin_book NK103 50 50 30 30 40
4 Che_book CH001 40 40 40 20 35
S.No Item_Name Item_code 1st 2nd 3rd 4th Avg
1 Math_book BK001 27 36 35 23 30
2 Phy_book BJ008 30 40 40 30 35
3 Hin_book NK103 50 50 30 30 40
S.No Product Item_code 1st 2nd 3rd 4th Avg
1 Phy_book BJ008 30 40 40 30 35
2 Hin_book NK103 50 50 30 30 40
3 Che_book CH001 40 40 40 20 35
4 Bio_book BI005 50 30 40 60 45
Expected output:
S.No Product Item_code 1st 2nd 3rd 4th
1 Math_book BK001 54 72 70 46
2 Phy_book BJ008 90 120 120 90
3 Hin_book NK103 150 150 90 90
4 Che_book CH001 80 80 80 60
5 Bio_book BI005 50 30 40 60
Code i have been trying to use :
df = pd.read_csv(r'input.csv')
df2 = pd.read_csv(r'output.csv')
How i Can add all the values by groupBy considering (Item,Item_Name,Product) columns submit the values in df2
Please Suggest