I am trying to write the program below in Apache Pig, running it directly on the unstructured data as it is.
i) I have a dataset which contains street name, city and state.
ii) Group by state.
iii) Take the count of each state in the dataset. My output will then be of the form statename,count, where count is how many times that state appears in the dataset.
Program:
realestate = LOAD 'data' USING PigStorage(',') AS (street:chararray, city:chararray, state:chararray);
A = GROUP realestate BY state;
B = FOREACH A GENERATE group, COUNT(realestate);
The output will look like:
CA,14
washington,20
Now I need the max of the counts, so my output should be "washington,20".
How should I proceed? Please help me resolve this issue.
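
One direction that might work, as an untested sketch: sort the per-state counts in descending order and keep only the first row. The path 'data', the field aliases state and cnt, and the relation names C and D are placeholders I made up for illustration, and I am assuming the file is comma-separated with exactly three fields per line.

```pig
-- Load the comma-separated records ('data' is a placeholder path).
realestate = LOAD 'data' USING PigStorage(',')
             AS (street:chararray, city:chararray, state:chararray);

-- One group per state.
A = GROUP realestate BY state;

-- Count the records in each group; cnt is an assumed alias.
B = FOREACH A GENERATE group AS state, COUNT(realestate) AS cnt;

-- Sort by count, largest first, and keep only the top row.
C = ORDER B BY cnt DESC;
D = LIMIT C 1;

DUMP D;
```

If two states share the highest count, LIMIT 1 would keep only one of them, so this sketch fits only when a single top state is enough.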