I want to sum the values of one field across all rows in an alias. This must be simple, but somehow I can't find the answer. Is that because what I want is a scalar value, while Pig handles datasets? I guess I could create a single row with a field that holds the sum? Please advise!
I found an answer, but I don't have enough reputation to answer my own question within 8 hours of posting. I will add my answer later tonight. – kee Mar 27 '12 at 23:04
1 Answer
This can be achieved using GROUP ALL to bring every row into a single group, and then the SUM function to add together the values of the field across that group:
DESCRIBE a;
-- a: (name, age, height)
b = GROUP a ALL;                    -- put every row of a into one group
c = FOREACH b GENERATE SUM(a.age);  -- one output row holding the total of age
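
On the "scalar" part of the question: since c contains exactly one row, Pig versions that support casting relations to scalars let you refer to its field from another statement. Here is a minimal sketch of that, not a definitive recipe. The input file 'people.txt', the field types, and the names total_age and age_share are all assumptions made for illustration.

```pig
-- Hypothetical tab-separated input; the file name and field types are assumed.
a = LOAD 'people.txt' USING PigStorage('\t')
    AS (name:chararray, age:int, height:int);

-- Same approach as above: one group containing every row, then one summed row.
b = GROUP a ALL;
c = FOREACH b GENERATE SUM(a.age) AS total_age;

-- c has a single row, so c.total_age can be used like a scalar value here.
d = FOREACH a GENERATE name, age,
        (double)age / (double)c.total_age AS age_share;

DUMP d;
```

If c ever held more than one row, the scalar reference would fail at run time, which is why the GROUP ALL step matters.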

Chris White