I am new to Hadoop programming and am looking for help with Pig. My data comes from a simple .txt file that uses , as the delimiter. I have two use cases: I want to apply ltrim(rtrim()) to all the columns, and convert selected fields to UPPER case.
Here is my script:
party = LOAD '/party_test_pig.txt' USING PigStorage(',') AS (....);
Upper_party = FOREACH party GENERATE UPPER(col1), UPPER(col2), UPPER(col3);
Trim_party = FOREACH Upper_party GENERATE TRIM(*);
Upper_party:
After converting these fields to uppercase, I want to keep all the columns in the output, not only the ones that were changed to uppercase.
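To make the goal concrete, here is a rough sketch of the output I am after. The five-column schema is only an assumption for illustration (my real field list is longer), and the col4 .. col5 range projection is something I believe newer Pig releases support, so treat it as untested:

```pig
-- Assumed schema for illustration only; the real field list is elided above.
party = LOAD '/party_test_pig.txt' USING PigStorage(',')
        AS (col1:chararray, col2:chararray, col3:chararray,
            col4:chararray, col5:chararray);

-- Uppercase the selected fields and pass the remaining ones through unchanged.
Upper_party = FOREACH party GENERATE
    UPPER(col1) AS col1,
    UPPER(col2) AS col2,
    UPPER(col3) AS col3,
    col4 .. col5;  -- range projection: keeps col4 through col5 as they are
```

The AS aliases keep the original field names, so later statements can still refer to col1, col2 and col3.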
Trim_party:
I did some research and found that, to trim all columns at once, I would have to write a UDF. I could write Trim_party = FOREACH Upper_party GENERATE TRIM(col1)...TRIM(coln);
but I feel this is inefficient and time-consuming.
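For reference, this is roughly what the explicit per-column version would look like. The four-column schema and the relation name Clean_party are made up for the example; as far as I can tell, TRIM takes a single chararray and strips both leading and trailing whitespace, so it should cover ltrim(rtrim()):

```pig
-- Assumed schema for illustration only; the real field list is elided above.
party = LOAD '/party_test_pig.txt' USING PigStorage(',')
        AS (col1:chararray, col2:chararray, col3:chararray, col4:chararray);

-- Clean_party is a hypothetical name. Nesting TRIM inside UPPER handles
-- both use cases in one FOREACH instead of two separate passes.
Clean_party = FOREACH party GENERATE
    UPPER(TRIM(col1)) AS col1,
    UPPER(TRIM(col2)) AS col2,
    UPPER(TRIM(col3)) AS col3,
    TRIM(col4)        AS col4;
```

This works for a handful of fields, but every column has to be listed by hand, which is the part I would like to avoid.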
Is there any other way to make this script work without writing a UDF for TRIM?
Thanks in advance.