I need help writing a U-SQL script that writes records to two different files depending on whether a value matches a regular expression. Let me explain my scenario in detail.
Let us assume my input file has two columns, "Name" and person identification number ("PIN"):
Name,PIN
John,12345
Harry,01234
Tom,24659
My condition for PIN is that it must start with either 1 or 2. In the example above, records 1 and 3 are valid and record 2 is invalid.
I need to write records 1 and 3 to my processed output file and record 2 to my error file.
How can I do this, and can I use Regex.Match to validate the value against the regular expression?
Here is my code:
@person =
    EXTRACT UserId int,
            PNR string,
            UID string
    FROM "/Samples/Data/person.csv"
    USING Extractors.Csv();

@rs1 =
    SELECT UserId, PNR, UID,
           Regex.Match(PNR, "^(19|20)[0-9]{2}(0[1-9])$") AS pnrval,
           Regex.Match(UID, "^(19|20)[0-9]{2}$") AS uidval
    FROM @person;

// Rows where at least one pattern matched
@rs2 =
    SELECT UserId, PNR, UID
    FROM @rs1
    WHERE pnrval == true OR uidval == true;

// Rows where a pattern did not match
@rs3 =
    SELECT UserId, PNR, UID
    FROM @rs1
    WHERE pnrval == false OR uidval == false;

OUTPUT @rs2
TO "/output/sl.csv"
USING Outputters.Csv();

OUTPUT @rs3
TO "/output/error.csv"
USING Outputters.Csv();
But I'm receiving this error:
Error E_CSC_USER_INVALIDCOLUMNTYPE: 'System.Text.RegularExpressions.Match' cannot be used as column type.
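The message points at the likely cause: Regex.Match returns a System.Text.RegularExpressions.Match object, not a bool, and that type is rejected as a rowset column. Regex.IsMatch returns a bool instead, so it can serve as the validity flag. Below is a minimal sketch of how the split might look for the Name/PIN example. It assumes the file has a header row, that PIN is read as a string so the leading zero is kept, and that "/output/processed.csv" is a placeholder path. The Regex class is fully qualified only so that the sketch does not depend on which namespaces are available by default.

```usql
// Read both columns as strings; skip the header row (assumed to be present).
@person =
    EXTRACT Name string,
            PIN string
    FROM "/Samples/Data/person.csv"
    USING Extractors.Csv(skipFirstNRows: 1);

// IsMatch returns bool, which is a valid column type.
// A null PIN is treated as an empty string and therefore as invalid.
@checked =
    SELECT Name,
           PIN,
           System.Text.RegularExpressions.Regex.IsMatch((PIN ?? "").Trim(), "^[12]") AS IsValid
    FROM @person;

// PIN starts with 1 or 2: goes to the processed file.
@valid =
    SELECT Name, PIN
    FROM @checked
    WHERE IsValid;

// Everything else: goes to the error file.
@invalid =
    SELECT Name, PIN
    FROM @checked
    WHERE !IsValid;

OUTPUT @valid
TO "/output/processed.csv"
USING Outputters.Csv();

OUTPUT @invalid
TO "/output/error.csv"
USING Outputters.Csv();
```

With the three sample rows, John and Tom would be expected in the processed file and Harry in the error file. Because the two WHERE clauses are exact complements, every input row ends up in exactly one of the outputs. If the Match object itself is needed, its Success property is also a bool and could be selected in place of the IsMatch call.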