Say I have two Pig relations:
p, with schema (id: int, companies: tuple(name:chararray))
and
q, with schema (id: int, company: chararray).
After I join p and q on id, how do I filter out the rows where q::company is not present in p::companies?
PS: I went through the question "Check if an element is present in a bag?", but it doesn't seem to match my problem exactly.
Example
sample p
1,(c1 c2 c3)
2,(c4 c5 c6)
3,(c2 c3 c5)
sample q
1,c3
2,c8
3,c5
expected output after the join and filter
1,c3
3,c5
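
To pin down the semantics I am after, here is a minimal sketch in Python (not Pig) of the join-then-filter on the sample data. It assumes each companies value can be treated as a collection of separate names (c1, c2, c3), which is how the sample reads. The dict/list layout and the function name join_and_filter are made up purely for illustration.

```python
# Hypothetical in-memory stand-ins for the Pig relations p and q.
# Assumption: each companies value is a collection of separate names.
p = {
    1: ("c1", "c2", "c3"),
    2: ("c4", "c5", "c6"),
    3: ("c2", "c3", "c5"),
}
q = [(1, "c3"), (2, "c8"), (3, "c5")]


def join_and_filter(p, q):
    """Join q to p on id, keeping only the rows whose company
    appears among that id's companies."""
    return [
        (qid, company)
        for qid, company in q
        if qid in p and company in p[qid]
    ]


result = join_and_filter(p, q)
print(result)  # [(1, 'c3'), (3, 'c5')]
```

The row for id 2 is dropped because c8 is not among c4, c5, c6. What I am looking for is the equivalent of this membership test in Pig Latin.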