I have a Pig script that passes a tuple containing a bag of tuples to a Java UDF. In the UDF, I read fields by alias using AliasableEvalFunc. I can read the bag by its alias, but not the fields of the tuples inside the bag by their aliases. For example, suppose Pig sends this to the UDF:

data = load 'input' using PigStorage(',') as (title:chararray,entities:bag{tuple:(entityName:chararray)});
data = foreach data generate udf(title,entities);

The file that contains sample data looks like this:

ThisIsTitle,{(SampleName)}

This is my UDF:

public class Udf extends AliasableEvalFunc<Tuple> {
    @Override
    public Tuple exec(Tuple input) throws IOException {
        String title = getString(input, "title"); // works
        DataBag entities = getBag(input, "entities"); // works
        for (Tuple entity : entities) {
            String name = getString(entity, "entityName"); // this throws an exception
        }
        return null; // building the result tuple is omitted here
    }
    // getOutputSchema(Schema) is omitted for brevity
}

Essentially, I can reference aliases at the top level only. For anything nested, getting a field by alias fails. Is this expected, or am I doing something wrong?

coder

1 Answer

I found that we need to use the getPrefixedAliasName method to build the alias of a field in the inner tuples. The AliasableEvalFunc.java source file has an example of this.
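In the UDF from the question, the failing line would presumably become getString(entity, getPrefixedAliasName("entities", "entityName")). To show why the prefix matters, here is a small self-contained sketch in plain Java. It does not use Pig or DataFu: the NestedAliasSketch class, its flattened alias map, and the "outer.inner" naming rule are simplifying assumptions made for illustration, not the library's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NestedAliasSketch {
    // Flattened alias -> position map; a simplified stand-in for what an
    // alias-aware UDF base class might build from the input schema.
    private final Map<String, Integer> positions = new HashMap<>();

    public NestedAliasSketch() {
        positions.put("title", 0);               // top-level field
        positions.put("entities", 1);            // top-level bag
        positions.put("entities.entityName", 0); // field inside each tuple of the bag
    }

    // Assumed naming rule: a nested alias is "<outer alias>.<inner alias>".
    public static String prefixedAlias(String prefix, String alias) {
        return (prefix == null || prefix.trim().isEmpty()) ? alias : prefix + "." + alias;
    }

    // Looks up a field by alias; unknown aliases fail, as in the question.
    public Object get(List<Object> tuple, String alias) {
        Integer pos = positions.get(alias);
        if (pos == null) {
            throw new IllegalArgumentException("Unknown alias: " + alias);
        }
        return tuple.get(pos);
    }

    public static void main(String[] args) {
        NestedAliasSketch udf = new NestedAliasSketch();

        // Mirrors the sample row: ThisIsTitle,{(SampleName)}
        List<Object> inner = new ArrayList<>();
        inner.add("SampleName");
        List<List<Object>> bag = new ArrayList<>();
        bag.add(inner);
        List<Object> input = new ArrayList<>();
        input.add("ThisIsTitle");
        input.add(bag);

        System.out.println(udf.get(input, "title"));

        @SuppressWarnings("unchecked")
        List<List<Object>> entities = (List<List<Object>>) udf.get(input, "entities");
        for (List<Object> entity : entities) {
            // The bare alias "entityName" is not in the map; the prefixed one is.
            System.out.println(udf.get(entity, prefixedAlias("entities", "entityName")));
        }
    }
}
```

In this model, the bare alias "entityName" is unknown because nested fields are registered only under the bag's alias, which matches the behaviour described in the question.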

coder