6

I want to sum the values of one field across all rows of an alias. This must be simple, but somehow I can't find the answer, probably because what I want is a scalar value while Pig works with datasets. I guess I could create a single row with a field that holds the sum? Please advise!

kee
    I found an answer, but I don't have enough reputation to answer my own question until 8 hours after posting. I will add my answer later tonight. – kee Mar 27 '12 at 23:04

1 Answer

13

This can be achieved by using GROUP ALL to bring every row into a single group, and then the SUM function to add up the field's values across that group:

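DESCRIBE a;
-- a: (name, age, height)

-- Put every row of a into a single group.
b = GROUP a ALL;
-- Sum the age field over that group; c has one row holding the total.
c = FOREACH b GENERATE SUM(a.age);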
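
For a fuller picture, here is a minimal end-to-end sketch of the same approach. The input file name people.csv, the comma delimiter, the field types, and the aliases total_age and d are assumptions made for illustration and are not from the question. The last statement relies on Pig's support for using a single-row relation as a scalar, which should cover the "scalar value" concern, though that feature depends on the Pig version in use.

```pig
-- Hypothetical input: people.csv with lines such as  alice,30,165
a = LOAD 'people.csv' USING PigStorage(',')
    AS (name:chararray, age:int, height:int);

-- GROUP ALL yields exactly one group containing every row of a.
b = GROUP a ALL;

-- One output row with a single field holding the sum of age.
c = FOREACH b GENERATE SUM(a.age) AS total_age;
DUMP c;

-- Because c has exactly one row, c.total_age can be used as a scalar
-- in later expressions, here each row's share of the total.
d = FOREACH a GENERATE name, age / (double)c.total_age;
DUMP d;
```

The scalar projection fails at run time if c ever contains more than one row, so it should only be applied to the result of a GROUP ALL aggregate like the one above.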
Chris White