Apple provides an example of using the tagger to identify named entities in Swift, but not one for Objective-C.

Here is the Swift example they provide:

let text = "The American Red Cross was established in Washington, D.C., by Clara Barton."
let tagger = NSLinguisticTagger(tagSchemes: [.nameType], options: 0)
tagger.string = text
let range = NSRange(location:0, length: text.utf16.count)
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
let tags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]
tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, stop in
    if let tag = tag, tags.contains(tag) {
        let name = (text as NSString).substring(with: tokenRange)
        print("\(name): \(tag)")
    }
}

I have gotten this far with the translation, with help from here, but I can't figure out how to specify the tags, e.g. [.personalName, .placeName, .organizationName]. Is that just an array of tag types through which you enumerate?

NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
                              initWithTagSchemes:[NSArray arrayWithObjects:NSLinguisticTagSchemeNameType, nil]
                              options:(NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerJoinNames)];
[tagger setString:text];

[tagger enumerateTagsInRange:NSMakeRange(0, [text length])
                      scheme:NSLinguisticTagSchemeNameType
                     options:(NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerJoinNames)
                  usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
                      NSString *token = [text substringWithRange:tokenRange];
                      NSString *name =[tagger tagAtIndex:tokenRange.location scheme:NSLinguisticTagSchemeNameType tokenRange:NULL sentenceRange:NULL];

                      if (name == nil) {
                          name = token;
                      }

                      NSLog(@"tagger results:%@, %@", token, name);
                  }];

Thanks for any suggestions on how to specify tags in Objective-C.

user6631314
  • `NSArray *tags = @[NSLinguisticTagPersonalName, ...]`... `if (tag && [tags containsObject:tag]) { NSString *name = [text substringWithRange:tokenRange]; NSLog(@"%@: %@", name, tag); }`? – Larme Sep 11 '18 at 19:54
  • Ok, I guess the tags are just an array to compare against, not a filter you can place into the tagger. – user6631314 Sep 11 '18 at 20:23
  • I don’t know that API, but that seems to be what the Swift code does. – Larme Sep 11 '18 at 20:56

1 Answer

Original Swift code:

let text = "The American Red Cross was established in Washington, D.C., by Clara Barton."
let tagger = NSLinguisticTagger(tagSchemes: [.nameType], options: 0)
tagger.string = text
let range = NSRange(location:0, length: text.utf16.count)
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
let tags: [NSLinguisticTag] = [.personalName, .placeName, .organizationName]
tagger.enumerateTags(in: range, unit: .word, scheme: .nameType, options: options) { tag, tokenRange, stop in
    if let tag = tag, tags.contains(tag) {
        let name = (text as NSString).substring(with: tokenRange)
        print("\(name): \(tag)")
    }
}

Output:

American Red Cross: NSLinguisticTag(_rawValue: OrganizationName)
Washington: NSLinguisticTag(_rawValue: PlaceName)
Clara Barton: NSLinguisticTag(_rawValue: PersonalName)

Objective-C version:

NSString* text = @"The American Red Cross was established in Washington, D.C., by Clara Barton.";
NSLinguisticTagger* tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeNameType] options:0];
tagger.string = text;
NSRange range = NSMakeRange(0, text.length);
NSLinguisticTaggerOptions options = NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerJoinNames;
NSArray* tags = @[NSLinguisticTagPersonalName, NSLinguisticTagPlaceName, NSLinguisticTagOrganizationName];
[tagger enumerateTagsInRange:range unit:NSLinguisticTaggerUnitWord scheme:NSLinguisticTagSchemeNameType options:options usingBlock:^(NSLinguisticTag _Nullable tag, NSRange tokenRange, BOOL * _Nonnull stop) {
    // tag is nullable, so check it before comparing against the wanted tags
    if (tag != nil && [tags containsObject:tag]) {
        NSString* name = [text substringWithRange:tokenRange];
        NSLog(@"%@: %@", name, tag);
    }
}];

Output:

2018-09-12 09:51:00.323378-0700 App[2408:109005] American Red Cross: OrganizationName
2018-09-12 09:51:00.323755-0700 App[2408:109005] Washington: PlaceName
2018-09-12 09:51:00.323901-0700 App[2408:109005] Clara Barton: PersonalName
matt