I am working on a Perl code base that validates customer input, and my goal is to block surrogate characters.
My idea is to first encode each character of the customer input as UTF-16 and then check whether the result falls in one of the surrogate ranges:
    foreach my $messageChar (@MessageChars) {
        my $messageCharUTF16 = Encode::encode("UTF-16", $messageChar);
        if (($messageCharUTF16 >= 0xD800 && $messageCharUTF16 <= 0xDBFF) || ($messageCharUTF16 >= 0xDC00 && $messageCharUTF16 <= 0xDFFF)) {
            # Then we have surrogate pairs
        }
    }
However, I am not getting the correct UTF-16 values from Encode::encode.
How can I detect the surrogate pairs? Is there a straightforward way to check whether a string contains surrogate characters in Perl?
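
For reference, here is a minimal sketch of the kind of check I am after. It assumes the input has already been decoded into a Perl character string, so each element is a code point rather than encoded bytes. `Encode::encode` returns a byte string (with a BOM for "UTF-16"), which is probably why the numeric comparison above never matches. The helper names `has_surrogate_code_point` and `needs_surrogate_pair` are hypothetical, and I am not sure which of the two cases I actually need to block:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string contains a code point in the
# surrogate range U+D800..U+DFFF (Unicode general category Cs).
sub has_surrogate_code_point {
    my ($text) = @_;
    return $text =~ /\p{Cs}/ ? 1 : 0;
}

# Hypothetical helper: true if the string contains a character above U+FFFF,
# i.e. one that UTF-16 would have to represent as a surrogate pair.
sub needs_surrogate_pair {
    my ($text) = @_;
    return $text =~ /[\x{10000}-\x{10FFFF}]/ ? 1 : 0;
}
```

The first check catches lone surrogate code points that ended up in the string; the second catches supplementary-plane characters (such as many emoji) that only become surrogate pairs once the text is encoded as UTF-16.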