My Avro schema looks like this:
{
  "type" : "record",
  "name" : "name1",
  "fields" :
  [
    {
      "name" : "f1",
      "type" : "string"
    },
    {
      "name" : "f2",
      "type" :
      {
        "type" : "array",
        "items" :
        {
          "type" : "record",
          "name" : "name2",
          "fields" :
          [
            {
              "name" : "time",
              "type" : [ "float", "int", "double", "long" ]
            }
          ]
        }
      }
    }
  ]
}
After reading it in Pig:
grunt> A = load 'data' using AvroStorage();
grunt> DESCRIBE A;
A: {f1: chararray,f2: {ARRAY_ELEM: (time: (FLOAT: float,INT: int,DOUBLE: double,LONG: long))}}
What I want is probably a bag of (f1:chararray, timestamp:double). This is what I did:
grunt> B = FOREACH A GENERATE f1, f2.time AS timestamp;
grunt> DESCRIBE B;
B: {f1: chararray,timestamp: {(time: (FLOAT: float,INT: int,DOUBLE: double,LONG: long))}}
So how do I flatten this record?
I'm new to Pig and Avro, and I don't know whether what I'm trying to do even makes sense. Thanks for your help.
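One possible direction, as an untested sketch rather than a confirmed answer: FLATTEN the bag so each array element becomes its own row, FLATTEN the union tuple into separate columns, then pick the first non-null branch and cast it to double. This assumes the loader fills only the matching branch of the union and leaves the other three null, which the DESCRIBE output above does not confirm. The aliases C, D and E are made up for illustration, and positional references ($1, $2, ...) are used because the field names FLOAT, INT, DOUBLE and LONG may clash with Pig's type keywords.
grunt> -- one row per array element: (f1, time), where time is still the union tuple
grunt> C = FOREACH A GENERATE f1, FLATTEN(f2);
grunt> -- unnest the tuple: $0 = f1, $1 = FLOAT, $2 = INT, $3 = DOUBLE, $4 = LONG
grunt> D = FOREACH C GENERATE f1, FLATTEN($1);
grunt> -- assumption: exactly one of $1..$4 is non-null per row
grunt> E = FOREACH D GENERATE f1, ($3 IS NOT NULL ? $3 : ($1 IS NOT NULL ? (double)$1 : ($2 IS NOT NULL ? (double)$2 : (double)$4))) AS timestamp:double;
grunt> DESCRIBE E;
If that assumption holds, E should have one (f1, timestamp) row per element of f2. Rows where every branch is null would end up with a null timestamp.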