I need to create a buffer in Cascading (Hadoop).

Suppose I have these fields:

member_id,amountpaid,diagnosis_id,diagnosis_description,superGrouper_id,superGrouper_description,grouperId,grouperDescription

I need to:

  1. Group the records by member_id and superGrouper_id.
  2. Send each group to a buffer using an Every pipe.
  3. Have the buffer output: member_id, the highest paid (after sorting) from superGrouper, the highest paid (after sorting) from grouperId, and the highest paid from diagnosis_id, along with their descriptions.

Please help me create this buffer. Thanks in advance.

victorkt
Rach

2 Answers

You don't need a custom buffer. Use the built-in Max aggregator from Cascading (see the Cascading docs).

You just need to apply Max in an Every pipe after a GroupBy:

pipe = new GroupBy(pipe, new Fields("member_id", "superGrouper_id"));
pipe = new Every(pipe, new Fields("amountpaid"), new Max(new Fields("max_paid")));
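
To make the effect of the GroupBy followed by Max concrete, here is a minimal plain-Java sketch (no Cascading dependency) of the same computation: the largest amountpaid per (member_id, superGrouper_id) pair. The class name, the Claim type, and the sample rows are hypothetical and only serve as illustration. This is not how Cascading implements it internally.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of "group by (member_id, superGrouper_id), then take max(amountpaid)".
// All names here are hypothetical; nothing below uses the Cascading API.
class MaxPerGroupSketch {

    // Only the fields that matter for this step.
    static final class Claim {
        final String memberId;
        final String superGrouperId;
        final double amountPaid;

        Claim(String memberId, String superGrouperId, double amountPaid) {
            this.memberId = memberId;
            this.superGrouperId = superGrouperId;
            this.amountPaid = amountPaid;
        }
    }

    // Returns the largest amountPaid seen for each "memberId|superGrouperId" key.
    static Map<String, Double> maxPaidPerGroup(List<Claim> claims) {
        Map<String, Double> maxByGroup = new LinkedHashMap<>();
        for (Claim c : claims) {
            String key = c.memberId + "|" + c.superGrouperId;
            // merge() keeps the larger of the stored value and the new one.
            maxByGroup.merge(key, c.amountPaid, Math::max);
        }
        return maxByGroup;
    }

    public static void main(String[] args) {
        List<Claim> claims = List.of(
            new Claim("m1", "sg1", 100.0),
            new Claim("m1", "sg1", 250.0),
            new Claim("m1", "sg2", 75.0),
            new Claim("m2", "sg1", 40.0));
        maxPaidPerGroup(claims)
            .forEach((group, max) -> System.out.println(group + " -> " + max));
    }
}
```

Note that an aggregator such as Max emits one value per group, so the description fields from the question would still have to be carried along separately.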
victorkt
Brian Ethier

You can do the following:

pipe = new GroupBy(pipe, new Fields("member_id", "superGrouper_id"), new Fields("superGrouper", "grouperId", "")); 
pipe = new Every(pipe, new FirstNBuffer(n)); // n = number of tuples to keep per group

I am sorry if I am wrong. Your question is not quite clear.
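
As a rough illustration of the idea behind a sorted GroupBy followed by FirstNBuffer, the plain-Java sketch below groups rows by (member_id, superGrouper_id), orders each group by amount paid in descending order, and keeps the first n rows. It does not use the Cascading API. The class, the Claim type, the sample data, and the choice of amountPaid as the sort key are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of "group, sort within the group, keep the first n rows".
// All names here are hypothetical; nothing below uses the Cascading API.
class FirstNPerGroupSketch {

    static final class Claim {
        final String memberId;
        final String superGrouperId;
        final String grouperId;
        final double amountPaid;

        Claim(String memberId, String superGrouperId, String grouperId, double amountPaid) {
            this.memberId = memberId;
            this.superGrouperId = superGrouperId;
            this.grouperId = grouperId;
            this.amountPaid = amountPaid;
        }
    }

    // Returns, per "memberId|superGrouperId" key, the n rows with the highest amountPaid.
    static Map<String, List<Claim>> firstNPerGroup(List<Claim> claims, int n) {
        // Step 1: collect the rows of each group.
        Map<String, List<Claim>> groups = new LinkedHashMap<>();
        for (Claim c : claims) {
            String key = c.memberId + "|" + c.superGrouperId;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(c);
        }
        // Step 2: sort each group by amountPaid (descending) and keep the first n rows.
        Map<String, List<Claim>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Claim>> e : groups.entrySet()) {
            List<Claim> sorted = new ArrayList<>(e.getValue());
            sorted.sort(Comparator.comparingDouble((Claim c) -> c.amountPaid).reversed());
            int limit = Math.min(n, sorted.size());
            result.put(e.getKey(), new ArrayList<>(sorted.subList(0, limit)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Claim> claims = List.of(
            new Claim("m1", "sg1", "g1", 100.0),
            new Claim("m1", "sg1", "g2", 250.0),
            new Claim("m1", "sg1", "g3", 175.0),
            new Claim("m2", "sg1", "g1", 40.0));
        firstNPerGroup(claims, 2).forEach((group, rows) -> {
            for (Claim c : rows) {
                System.out.println(group + " " + c.grouperId + " " + c.amountPaid);
            }
        });
    }
}
```

Because whole rows are kept, the description fields travel with the selected rows, which is closer to the output the question asks for than a bare maximum.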

victorkt
pramesh