I have sample input of tab-separated key/value pairs, as follows:

B_1001@2012-06-15   96.73429163933419@0.5511284347710459
B_1001@2012-06-18   187.4348199976547@0.5544551559243536
B_1002@2012-09-26   745.4912066349087@0.8398570478932768
B_1002@2012-09-28   60.97117969729124@0.8500267379723409

I am loading this file into Pig and doing the following:

a = load '/home/HadoopUser/Desktop/a.txt' as (key:chararray, value:chararray);

describe a;
a: {key: chararray,value: chararray}

b = foreach a generate key, flatten(STRSPLIT(value,'@',2)) as (v1:double,v2:float);
describe b;
b: {key: chararray,v1: double,v2: float}

c = group b by key;
describe c;
c: {group: chararray,b: {key: chararray,v1: double,v2: float}}

This works up to here, but when I use arithmetic calculations over b.v1 I get a ClassCastException: java.lang.String cannot be cast to java.lang.Double.

describe, however, gives no error:

d = foreach c generate group,SUM(b.v1);
describe d;
d: {group: chararray,double}

When I dump d; it throws the exception.

I also tried typecasting in 'b':

b = foreach a generate key, (tuple (double,double))STRSPLIT(value,'@',2); 

Now when I describe b; it gives the error: Cannot cast tuple with schema tuple to tuple with schema tuple({double,double})

Please help me understand why this happens even though describe shows the correct schema.

user7337271
sudheer

1 Answer

I have run into this issue before as well. I can't find the bug tracker link for it right now, but when you set the type with an AS clause, as in B = FOREACH A GENERATE key AS key: chararray, Pig does not actually cast the value; it only changes the output of DESCRIBE. That is why describe shows double and float while the values produced by STRSPLIT are still strings at runtime. You are right that you have to do an explicit cast, and the docs say that you can cast a chararray to a double. Try something like:

b1 = FOREACH b GENERATE key, (double)v1, (float)v2 ;

Update: Here is the link to the bug: https://issues.apache.org/jira/browse/PIG-2315
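
For context, here is a minimal sketch of the whole pipeline from the question with the explicit cast applied. The alias b1 comes from the snippet above; the output field name total_v1 and the choice to label the split fields as chararray are my own assumptions for illustration, and I have not run this against your data or Pig version:

```pig
-- Sketch only: same load path and fields as in the question.
a = LOAD '/home/HadoopUser/Desktop/a.txt' AS (key:chararray, value:chararray);

-- STRSPLIT produces string fields, so label them as chararray here
-- (assumption: this avoids a schema that claims a type the data does not have).
b = FOREACH a GENERATE key, FLATTEN(STRSPLIT(value, '@', 2)) AS (v1:chararray, v2:chararray);

-- The explicit casts are what actually convert the values.
b1 = FOREACH b GENERATE key, (double)v1 AS v1, (float)v2 AS v2;

c = GROUP b1 BY key;

-- total_v1 is a hypothetical name for the aggregated column.
d = FOREACH c GENERATE group, SUM(b1.v1) AS total_v1;
DUMP d;
```

The point of the intermediate b1 relation is that SUM then receives real doubles rather than strings that are only described as doubles.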

mr2ert
    Thanks a lot, the above worked exactly. These links explained why as well: http://stackoverflow.com/questions/12213659/schema-of-flatten-operator-in-pig-latin and the Pig schema documentation at http://pig.apache.org/docs/r0.9.1/basic.html#Schemas – sudheer Sep 30 '13 at 06:42