Case
It seems that Spoofchecker from the Intl extension yields false positives:
<?php // 7.0 on linux
// File encoding of this script is UTF-8 (thus without BOM)
$sDefaultLocale = \Locale::getDefault(); // getDefault() is static, no instance needed
$oSpoofchecker = new \Spoofchecker;
$oSpoofchecker->setAllowedLocales($sDefaultLocale);
$sText = 'abc'; // US-ASCII
header('Content-Type: text/plain');
print
'Default locale: ' . $sDefaultLocale . PHP_EOL
. 'Byte length: ' . strlen($sText) . PHP_EOL // US-ASCII check
. 'Text "' . $sText . '" '
. ($oSpoofchecker->isSuspicious($sText, $sError) ? 'IS' : 'IS NOT')
. ' suspicious' . PHP_EOL
. 'Spoofchecker internal error information:' . PHP_EOL;
var_dump($sError);
Results
Default locale: en_US_POSIX
Byte length: 3
Text "abc" IS suspicious
Spoofchecker internal error information:
NULL
Expected results
Text "abc" IS NOT suspicious
This is because abc is US-ASCII, which presumably should be allowed by default for en_US_POSIX. Also, the PHP Spoofchecker class documentation mentions that Spoofchecker::isSuspicious() returns TRUE if any non-English characters are used, which is not the case here.
Possible causes
The documentation of Spoofchecker::setAllowedLocales() is currently close to non-existent: the argument list does not say which values are accepted. One can only assume that they must be compatible with those of Locale. The Locale documentation reads:
Locales are identified using RFC 4646 language tags (which use hyphen, not underscore)
This contradicts the test result, in which Locale reports the default locale with underscores instead of hyphens. However, when running another test with $oSpoofchecker->setAllowedLocales('en-US'); the results stay the same.
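One way to narrow this down further might be to run each check in isolation and see which one flags the text. The following is only a diagnostic sketch, not a fix: it assumes the Intl extension is loaded, relies on Spoofchecker::setChecks() and the check constants defined on the Spoofchecker class, and the helper name probeSpoofChecks() is made up for illustration. Which checks fire is likely to depend on the ICU version PHP was built against.

```php
<?php
// Hypothetical helper: runs every Spoofchecker check on its own against $sText
// and returns an array of check name => bool (TRUE means "flagged as suspicious").
function probeSpoofChecks(string $sText, string $sLocales): array
{
    $aChecks = [
        'SINGLE_SCRIPT_CONFUSABLE' => \Spoofchecker::SINGLE_SCRIPT_CONFUSABLE,
        'MIXED_SCRIPT_CONFUSABLE'  => \Spoofchecker::MIXED_SCRIPT_CONFUSABLE,
        'WHOLE_SCRIPT_CONFUSABLE'  => \Spoofchecker::WHOLE_SCRIPT_CONFUSABLE,
        'SINGLE_SCRIPT'            => \Spoofchecker::SINGLE_SCRIPT,
        'INVISIBLE'                => \Spoofchecker::INVISIBLE,
        'CHAR_LIMIT'               => \Spoofchecker::CHAR_LIMIT,
    ];

    $aResults = [];
    foreach ($aChecks as $sName => $iCheck) {
        // Fresh instance per check so that no state leaks between runs.
        $oSpoofchecker = new \Spoofchecker;
        $oSpoofchecker->setAllowedLocales($sLocales);
        // Called after setAllowedLocales() so that only this one check is active.
        $oSpoofchecker->setChecks($iCheck);
        $aResults[$sName] = $oSpoofchecker->isSuspicious($sText);
    }

    return $aResults;
}

// Compare the locale spellings discussed above for the same US-ASCII text.
foreach (['en_US_POSIX', 'en-US'] as $sLocales) {
    print 'Allowed locales: ' . $sLocales . PHP_EOL;
    var_dump(probeSpoofChecks('abc', $sLocales));
}
```

If the output is identical for both locale spellings, the locale format is probably not the cause, and the check that reports TRUE is the one to look into.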
Question
How do I use Spoofchecker::isSuspicious() properly?