I have some memory issues in Pig.
This is my code:
a = load 'some file';
b = load 'file2';
cond = load 'cond file';
c = union a,b;
cc = join c by $0, cond by $0;
dd = foreach cc generate $0,$1;
reduce = foreach (group dd by RANDOM()) generate flatten(dd);
cc = join c by $1, cond by $0;
dd = foreach cc generate $1,$2;
reduce2 = foreach (group dd by RANDOM()) generate flatten(dd);
final = union reduce, reduce2;
store final into 'final_output';
Will there be any issues with this code? I ran it on a small sample and it seems fine, but I am not sure whether it has implications that I am unaware of.
Please ignore the code quality; I know this is not a good way to write scripts, or code in general. However, this is just a one-off script.