You were on the right track, but you missed one thing about STRSPLIT: it also works when the number of fields is not fixed. The third argument of that UDF is a limit on how many elements the result tuple may have. If you pass a negative number, no limit applies, and the input is split on every match of your expression.
From the official documentation for STRSPLIT:
limit
If the value is positive, the pattern (the compiled representation of the regular expression) is applied at most limit-1 times, therefore the value of the argument means the maximum length of the result tuple. The last element of the result tuple will contain all input after the last match.
If the value is negative, no limit is applied for the length of the result tuple.
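To make the two cases concrete, here is a small Java sketch. It assumes STRSPLIT's limit behaves like the limit of Java's String.split(regex, limit), whose documentation the text above closely mirrors. The class name StrSplitLimitDemo and the helper split are made up for illustration and are not part of Pig.

```java
import java.util.Arrays;

public class StrSplitLimitDemo {
    // Hypothetical helper: same argument order as STRSPLIT(input, regex, limit).
    // Assumption: Pig's limit follows the semantics of String.split(regex, limit).
    public static String[] split(String input, String regex, int limit) {
        return input.split(regex, limit);
    }

    public static void main(String[] args) {
        String row = "abc|def|xyz|abc|def|xyz";

        // Positive limit: at most 2 elements; the last one keeps the unsplit remainder.
        System.out.println(Arrays.toString(split(row, "\\|", 2)));
        // prints [abc, def|xyz|abc|def|xyz]

        // Negative limit: the pattern is applied as many times as it matches.
        System.out.println(Arrays.toString(split(row, "\\|", -1)));
        // prints [abc, def, xyz, abc, def, xyz]
    }
}
```

With a positive limit you would have to know the maximum number of fields in advance, which is why the negative value is the one to use here.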
Imagine this input in data.txt:
abc|def|xyz,1
abc|def|xyz|abc|def|xyz,2
You can do the following:
A = load 'data.txt' using PigStorage(',');
B = foreach A generate STRSPLIT($0,'\\|',-1);
And the output will be:
DUMP B;
((abc,def,xyz))
((abc,def,xyz,abc,def,xyz))