Given the name "David" written three different ways ("DAVID david David"), CoreNLP marks only #1 and #2 as MALE, even though #3 is the only one marked as a PERSON. I'm using the standard model as originally provided. I tried to implement the suggestions listed here, but 'gender' is no longer allowed before NER. My test is below; the results are the same in both Java and Jython (Word, Gender, NER Tag):

DAVID, MALE, O
david, MALE, O
David, None, PERSON
Cameron M.
  • Hi, this looks broken to me. I am going to review the GenderAnnotator and make some fixes to resolve this. I'll let you know when the new version is submitted to GitHub. – StanfordNLPHelp Sep 25 '17 at 05:40

1 Answer

This is a bug in Stanford CoreNLP 3.8.0.

I have modified the GenderAnnotator and submitted the changes, which are now available on GitHub. I am still working on this, so there will probably be further changes over the next day or so, but I think this bug is fixed now. You will also need the latest version of the models jar, which was just updated to include the name lists. I expect to build another models jar with larger name lists shortly.

The new version of GenderAnnotator requires the entitymentions annotator to be used. Also, the new version logs the gender of both the CoreMap for the entity mention and for each token of the entity mention.
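As a rough illustration of that requirement, the sketch below builds a pipeline configuration in which entitymentions runs before gender. The exact annotator list is an assumption (only the need for entitymentions is stated above), and the class name `GenderPipelineConfig` is hypothetical. The code uses only `java.util.Properties`, so it runs without CoreNLP on the classpath.

```java
import java.util.Properties;

// Hypothetical helper: only assembles the configuration, does not load CoreNLP.
class GenderPipelineConfig {

    // Assumed annotator order: entitymentions after ner, gender after entitymentions.
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("annotators",
                "tokenize,ssplit,pos,lemma,ner,entitymentions,gender");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        System.out.println(props.getProperty("annotators"));
        // With the CoreNLP jars available, the next step would be:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    }
}
```

The resulting `Properties` object is what would be handed to the `StanfordCoreNLP` constructor in the Java or Jython test.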

Instructions for working with the latest version of Stanford CoreNLP from GitHub are here: https://stanfordnlp.github.io/CoreNLP/download.html

StanfordNLPHelp
  • Thanks for your help; the changes work perfectly, but I have another question. Now that the first names have been moved into three different files, how can I provide my own database file for both genders? – Cameron M. Sep 26 '17 at 22:08
  • I will change the code to allow you to supply your own name lists. – StanfordNLPHelp Sep 27 '17 at 05:17
  • In about 10-20 minutes the change will be on GitHub. You need to use the "gender.maleNamesFile" and "gender.femaleNamesFile" properties. – StanfordNLPHelp Sep 27 '17 at 05:21
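A minimal sketch of how the two properties named in the last comment might be set, assuming they are passed alongside the annotator list in the same `Properties` object. The file paths and the class name `CustomGenderNamesConfig` are hypothetical, the annotator list is an assumption, and the expected format of the name files is not stated in this thread.

```java
import java.util.Properties;

// Hypothetical helper: only assembles the configuration, does not load CoreNLP.
class CustomGenderNamesConfig {

    // Both path arguments are placeholders for user-supplied name list files.
    static Properties build(String maleNamesPath, String femaleNamesPath) {
        Properties props = new Properties();
        // Assumed annotator list; gender needs entitymentions per the answer.
        props.setProperty("annotators",
                "tokenize,ssplit,pos,lemma,ner,entitymentions,gender");
        // Property names as given in the comment above.
        props.setProperty("gender.maleNamesFile", maleNamesPath);
        props.setProperty("gender.femaleNamesFile", femaleNamesPath);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder paths, not real files.
        Properties props = build("/path/to/male_names.txt",
                                 "/path/to/female_names.txt");
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```

As with any CoreNLP setting, the same keys could presumably go in a `.properties` file instead of being set in code.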