I have a Pig job in which I need to filter the data by finding a word in one of the fields.

Here is the snippet:

A = LOAD '/home/user/filename' USING PigStorage(',');
B = FOREACH A GENERATE $27,$38;
C = FILTER B BY ( $1 ==  '*Word*');
STORE C INTO '/home/user/out1' USING PigStorage();

The error occurs on the third line, where C is defined. I have also tried:

C = FILTER B BY $1 MATCHES '*WORD*'  

and also:

C = FILTER B BY $1 MATCHES '\\w+WORD\\w+'  
Zoe
learner
'.' matches any character (may or may not match line terminators); '*' means zero or more times. See https://docs.oracle.com/javase/1.5.0/docs/api/java/util/regex/Pattern.html – isaikkimuthu Oct 01 '18 at 18:27

1 Answer

MATCHES uses Java regular expressions, not glob-style wildcards, and the pattern has to match the whole field. You should use ... MATCHES '.*WORD.*' instead.
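
As a rough illustration of why the three attempts in the question fail, here is a small Java sketch. It assumes that Pig's MATCHES behaves like Java's full-string Pattern.matches; the class name PigMatchesDemo, its helper methods, and the sample field value are made up for this example and are not part of Pig.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PigMatchesDemo {
    // Full-string match: the regex must cover the entire input,
    // which is the behavior assumed here for Pig's MATCHES.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Returns true when the pattern is not a valid Java regex at all.
    static boolean isInvalid(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String field = "some WORD here"; // hypothetical field value

        // Glob syntax: a leading '*' has nothing to repeat, so it is rejected.
        System.out.println(isInvalid("*WORD*"));               // true

        // The bare word does not cover the whole field.
        System.out.println(fullMatch("WORD", field));          // false

        // \w+ does not match the spaces around WORD.
        System.out.println(fullMatch("\\w+WORD\\w+", field));  // false

        // .* absorbs anything before and after WORD.
        System.out.println(fullMatch(".*WORD.*", field));      // true
    }
}
```

Note that '\\w+WORD\\w+' would only succeed when WORD is embedded in a single run of letters, digits, or underscores with at least one such character on each side, which is why '.*WORD.*' is the more forgiving choice.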

There is an example here of finding the word 'apache'.

user229044
Donald Miner