Bash - How to combine duplicate rows and sum the values of a csv file?

Question

I have this csv file:

Apple,5
Apple,2
Banana,3
Banana,7
Avocado,1
Grapefruit,9

And I want to combine the duplicate rows on the left and sum the values on the right. This is the end result that I want:

Apple,7
Banana,10
Avocado,1
Grapefruit,9

I know this is more of Ms Excel stuff but I want to learn how to do it with the bash script. So far I can only sort it like this:

sort -u file.csv

I don't know how to sum the values on the right based on each fruit on the left. How do I do this ?

EDIT: The answer provided below doesn't work with my actual file so here's the complete csv file

Order Date,Username,Order Number,No Resi,Quantity,Title,Update Date,Status,Price Per Item,Status Tracking,Alamat
05 Jun 2018,Mildred@email.com,205583995140400,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Syahrul Address
05 Jun 2018,Mildred@email.com,205583995140400,,1,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Syahrul Address
05 Jun 2018,Martha@email.com,205486016644400,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Faishal  Address
05 Jun 2018,Martha@email.com,205486016644400,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Faishal  Address
05 Jun 2018,Misty@email.com,205588935534900,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Rutwan Address
05 Jun 2018,Misty@email.com,205588935534900,,1,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Rutwan Address

I want the end result to be like this:

Order Date,Username,Order Number,No Resi,Quantity,Title,Update Date,Status,Price Per Item,Status Tracking,Alamat
05 Jun 2018,Mildred@email.com,205583995140400,,3,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Syahrul Address
05 Jun 2018,Martha@email.com,205486016644400,,4,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Faishal  Address
05 Jun 2018,Misty@email.com,205588935534900,,3,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Rutwan Address

I only want to sum the Quantity row while leaving the rest as it is. How do I do it with awk?

The only diff for a CSV file would be to set input and output field separators, `awk 'BEGIN{FS=OFS=","}{a[$1]+=$2}END{for(i in a) print i,a[i]}' file.csv` — Inian, Jun 07 '18 at 08:37
What if I have 11 rows and I want to sum only the 5th row ? What is the awk command for that ? — Joe, Jun 07 '18 at 08:44
Use `$5` instead of `$2` and if you are planning to use much more of `awk` suggest finding a good online documentation for it — Inian, Jun 07 '18 at 08:47
It doesn't work, can you help any further ? I'll post my complete csv file if you'd like — Joe, Jun 07 '18 at 09:05
You should have posted the actual csv in the first place. I think you need to work it out based on the answers to those duplicate questioms — Inian, Jun 07 '18 at 09:06
I have provided my actual file. I wanted to make it simpler for people to understand my question, but it turned out the answer didn't work with my actual file. Can you help please ? — Joe, Jun 07 '18 at 09:28

Bash - How to combine duplicate rows and sum the values of a csv file?

0 Answers0