I got strange results using NSDataDetector and I am looking for insight into how it works.
Does it match against an internal database, or does it use some kind of parsing algorithm to separate the fields in a string?
Currently, I am using the following code to detect the fields of an address:
NSDataDetector *address = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeAddress error:nil];
NSArray *matcheslinkaa = [address matchesInString:inputString options:0 range:NSMakeRange(0, [inputString length])];
if ([matcheslinkaa count] > 0)
{
    for (NSTextCheckingResult *match in matcheslinkaa)
    {
        if ([match resultType] == NSTextCheckingTypeAddress)
        {
            // addressComponents maps keys such as Street, City, State, ZIP, Country to the detected values
            NSDictionary *components = [match addressComponents];
            NSLog(@"addressComponents %@", components);
        }
    }
}
Here are some sample input strings and the output the above code produces for each:
inputString = @"100 Main Street\n"
"Anytown, NY 12345\n"
"USA";
// prints:
// addressComponents {
// City = Anytown;
// Country = USA;
// State = NY;
// Street = "100 Main Street";
// ZIP = 12345;
// }
inputString = @"A-205 Natasha Golf View\n"
"2 Inner Ring Road\n"
"Bangalore\n"
"560071\n"
"Karnataka";
// prints:
// addressComponents {
// City = Bangalore;
// Street = "2 Inner Ring Road";
// ZIP = 560071;
// }
inputString = @"A-205 Natasha Golf View\n"
"2 Inner Ring Road\n"
"Domlur\n"
"Bangalore\n"
"560071\n"
"India";
// prints:
// addressComponents {
// City = Bangalore;
// Street = "2 Inner Ring Road";
// ZIP = 560071;
// }
inputString = @"Dak Bhavan\n"
"Parliament Street\n"
"NEW DELHI 110001\n"
"INDIA";
// => `addressComponents` is empty!
As you can see, NSDataDetector has no trouble extracting US addresses. Why does it fare so much worse with Indian addresses that it doesn't even find the country name?