I have a big data set of sales data like this:
Ordernumber Category Sold_Items
1 123 A 2
2 123 B 1
3 234 C 1
4 345 D 1
5 456 A 2
6 456 B 1
And I want to aggregate it to this:
A B C D frequency
2 1 2
1 1
1 1
So, basically I want to have one row for every category-combination which exists in my sales data. And I also want to know how frequent one combination is.
Well, I tried it with cast
and melt
and they get me there, but only when the data set is small enough. Unfortunately, I have like >3 million rows of data which cast
and melt
cannot handle any more.
Can anyone tell me how to aggregate my data in a fast way?
Thank you in advance!