11

Given a word, which may or may not be a singular-form noun, how would you generate its plural form?

Based on this NLTK tutorial and this informal list on pluralization rules, I wrote this simple function:

def plural(word):
    """
    Converts a word to its plural form.
    """
    if word in c.PLURALE_TANTUMS:
        # defective nouns, fish, deer, etc
        return word
    elif word in c.IRREGULAR_NOUNS:
        # foot->feet, person->people, etc
        return c.IRREGULAR_NOUNS[word]
    elif word.endswith('fe'):
        # knife -> knives
        return word[:-2] + 'ves'
    elif word.endswith('f'):
        # wolf -> wolves
        return word[:-1] + 'ves'
    elif word.endswith('o'):
        # potato -> potatoes
        return word + 'es'
    elif word.endswith('us'):
        # cactus -> cacti
        return word[:-2] + 'i'
    elif word.endswith('on'):
        # criterion -> criteria
        return word[:-2] + 'a'
    elif word.endswith('y'):
        # community -> communities
        return word[:-1] + 'ies'
    elif word[-1] in 'sx' or word[-2:] in ['sh', 'ch']:
        return word + 'es'
    elif word.endswith('an'):
        return word[:-2] + 'en'
    else:
        return word + 's'

But I think this is incomplete. Is there a better way to do this?
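
One gap is easy to see: the -y branch above would turn "day" into "daies". Here is a minimal sketch of a vowel-aware variant, assuming only the usual rule that -y becomes -ies after a consonant. The helper name `plural_y` is hypothetical and not part of the function above.

```python
VOWELS = "aeiou"


def plural_y(word):
    """Pluralize a word ending in 'y' (hypothetical helper, not from the original code).

    consonant + y -> drop the 'y' and add 'ies' (community -> communities)
    vowel + y     -> just add 's'               (day -> days)
    """
    if len(word) > 1 and word.endswith("y") and word[-2] not in VOWELS:
        return word[:-1] + "ies"
    return word + "s"
```

The other suffix branches have similar exceptions, which is why a rule list like this tends to keep growing.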

Cerin
  • 8
    Please explain why the code you have given doesn't fulfil your needs, and what you *want* it to do. – Gareth Latty Sep 19 '13 at 18:45
  • 2
    I'm guessing the code posted is what they *want*, and they want us to supply the code that gives that output. – SethMMorton Sep 19 '13 at 18:51
  • I'm not sure this is possible without context. Did you know, sometimes the plural of "fish" is ["fishes"](http://en.wiktionary.org/wiki/fishes)? – Kevin Sep 19 '13 at 19:09
  • 1
    @Kevin: Presumably the whole reason the OP wants to use WordNet is that it has a database with that information in it, and an API to use that database. – abarnert Sep 19 '13 at 19:10
  • @Lattyware, my code is an illustration of what I'm trying to do, and what kind of I/O I'm working with. – Cerin Sep 19 '13 at 21:54
  • Ultimately, this is an [XY Problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). The best answer to "How do I use WordNet to generate plurals" is the one-word answer "Don't." Your real question should be "How do I generate plurals?", with some discussion of how you thought of WordNet and tried it but couldn't make it do what you want. – abarnert Sep 19 '13 at 22:46
  • Yikes, what an overall harsh response to an innocuous question. If you know of a reliable way to generate plurals, I'm all ears. – Cerin Sep 19 '13 at 22:49
  • @Cerin: I can't speak for other people, but I can guess why people have downvoted this question and written what you're taking as harsh comments. Read [the FAQ](http://stackoverflow.com/help/asking) on asking questions, then reread your question. I can easily see why, to many people, it looks like you didn't do any research, or attempt to solve the problem yourself, before asking. And I'm not even sure they're wrong. (What made you think WordNet was the right tool for generating plurals?) But really, none of that matters. They're downvoting your question, not you as a person. – abarnert Sep 19 '13 at 23:03
  • @abarnert, I thought it was the right tool because it's the largest open source collection of linguistic data on the Internet, although its toolset is notoriously complex. I hadn't seen the Wordnet FAQ page you posted, so I didn't know it wasn't possible. However, I find it frustrating that everyone's penalizing me for not posting a solution, while no one else seems to have the slightest idea of how to solve it either. I've updated my post with my attempted solution, which is just a bunch of hard-coded rules. I didn't post it originally because I think it's a pretty bad solution. – Cerin Sep 20 '13 at 13:33
  • 1
    Again, the only answer to your question as asked is "You can't." I tried to post an answer elaborating on that, and you responded by arguing irrelevant trivialities. It's unclear how anyone could give you an answer you'd be happy with. Which is presumably why the 5 editors who closed it did so. Meanwhile, again, they're downvoting and closing your question, not you. Asking unanswerable questions doesn't make you a bad person. Even getting angry about it rather than learning how to make SO work better for you doesn't make you a bad person. Stop taking things so personally and you'll be happier. – abarnert Sep 20 '13 at 18:12
  • this question has been edited and should be opened. @Cerin - check this out: http://www.clips.ua.ac.be/pages/pattern-en#pluralization – arturomp Sep 23 '13 at 11:34
  • @amp, Thanks, great find. I've actually seen that project before, but I didn't realize it had a pluralization function. Please post that as an answer when possible. – Cerin Sep 23 '13 at 15:45

4 Answers

35

The pattern-en package offers pluralization:

>>> import pattern.en
>>> pattern.en.pluralize("dog")
'dogs'
KetZoomer
arturomp
  • 11
    I tried this with Python 3 and it seems like it is not supported. It was fairly easy however to copy the pluralize function from https://github.com/clips/pattern/blob/master/pattern/text/en/inflect.py along with a copy of the license to get around the problem – EntilZha Jan 19 '16 at 17:14
  • 1
    I installed the pattern module as an egg for ease of distribution with my Python script, but it fails with the error `IOError: [Errno 20] Not a directory: '/path/to/eggs/Pattern-2.6-py2.7.egg/pattern/text/en/wordnet/dict/index.noun'` :-( – markshep May 10 '16 at 10:48
  • 1
    `soups` becomes `soupss` – Sergey Orshanskiy Jan 30 '17 at 18:40
  • 2
    It does have a few howlers (like quiz->quizs, but it knows quizzes->quiz). However, there is the possibility of passing a custom dict, so these can be handled without too much trauma. I think anything of this kind is bound to have a few errors - English is like that – havlock Apr 12 '19 at 13:50
  • 1
    Wow! A simple task of generating a plural form of noun comes with a price: the 'pattern' module pulls huge amount of dependencies! – Andrey Mar 17 '20 at 18:26
26

Another option, which supports Python 3, is inflect.

import inflect

engine = inflect.engine()
your_string = "dog"  # any singular noun
plural = engine.plural(your_string)
alanc10n
5

First, it's worth noting that, as the FAQ explains, WordNet cannot generate plural forms.

If you want to use it anyway, you can. With Morphy, WordNet might be able to generate plurals for many nouns… but it still won't help with most irregular nouns, like "children".


Anyway, the easy way to use WordNet from Python is via NLTK. One of the NLTK HOWTO docs explains the WordNet Interface. (Of course it's even easier to just use NLTK without specifying a corpus, but that's not what you asked for.)

There is a lower-level API to WordNet called pywordnet, but I believe it's no longer maintained (it became the foundation for the NLTK integration), and only works with older versions of Python (maybe 2.7, but not 3.x) and of WordNet (only 2.x).

Alternatively, you can always access the C API by using ctypes or cffi or building custom bindings, or access the Java API by using Jython instead of CPython.

Or, of course, you can call the command-line interface via subprocess.


Anyway, at least on some installations, if you give the simple Morphy interface a singular noun, it will return its plural, while if you give it a plural noun, it will return its singular. So:

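from nltk.corpus import wordnet as wn

# Plural in, singular out: this direction is what morphy is documented to do.
assert wn.morphy('dogs') == 'dog'
# Singular in: the OP's NLTK 2.0.4 returns the word unchanged, not a plural.
assert wn.morphy('dog') == 'dog'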

This isn't actually documented, or even implied, to be true, and in fact it's clearly not true for the OP, so I'm not sure I'd want to rely on it (even if it happens to work on your computer).

The other way around is documented to work, so you could write some rules that apply all possible English plural rules, call morphy on each one, and the first one that returns the starting string is the right plural.

However, the way it's documented to work is effectively by blindly applying the same kind of rules. So, for example, it will properly tell you that doges is not the plural of dog—but not because it knows dogs is the right answer; only because it knows doge is a different word, and it likes the "+s" rule more than the "+es" rule. So, this isn't going to be helpful.
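
For what it's worth, the generate-candidates-then-check idea can be sketched as follows. This is only a sketch under stated assumptions: the function names and the short rule list are hypothetical, and the lemmatizer is passed in as a plain callable so the code runs without NLTK. It inherits every limitation described above.

```python
def candidate_plurals(word):
    """Yield plural candidates from a few common English suffix rules.

    Hypothetical, deliberately incomplete rule list, most specific first.
    """
    if len(word) > 1 and word.endswith("y") and word[-2] not in "aeiou":
        yield word[:-1] + "ies"
    if word.endswith(("s", "x", "z", "ch", "sh")):
        yield word + "es"
    yield word + "s"


def plural_by_lemmatizer(word, lemmatize):
    """Return the first candidate that `lemmatize` maps back to `word`, else None.

    `lemmatize` is any callable taking a word form and returning its base
    form or None, for example: lambda form: wn.morphy(form, wn.NOUN)
    """
    for candidate in candidate_plurals(word):
        if lemmatize(candidate) == word:
            return candidate
    return None
```

Because the check only confirms that some suffix rule maps the candidate back to the word, it cannot find plurals that no rule produces.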

Also, as explained above, it has no rules for any irregular plurals—WordNet has no idea that children and child are related in any way.

Also, wn.morphy('reckless') will return 'reckless' rather than None. If you want None for words that aren't nouns, you'll have to test whether the word is a noun first. You can do this just sticking with the same interface, although it's a bit hacky:

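from nltk.corpus import wordnet as wn

def plural(word):
    # With no part of speech, morphy tries every POS; with wn.NOUN it only
    # succeeds when the word is known as a noun.
    result = wn.morphy(word)
    noun = wn.morphy(word, wn.NOUN)
    if noun in (word, result):
        return result
    return None  # not recognized as a noun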

To do this properly, you will actually need to add a plurals database instead of trying to trick WordNet into doing something it can't do.

Also, a word can have multiple meanings, and they can have different plurals, and sometimes there are even multiple plurals for the same meaning. So you probably want to start with something like (lemma for s in synsets(word, wn.NOUN) for lemma in s.lemmas if lemma.name == word) and then get all appropriate plurals, instead of just returning "the" plural.
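
A plurals database that returns every plural rather than "the" plural might look roughly like this. It is a minimal sketch: the table name, its few entries, and the "+s" fallback are all hypothetical, and a real table would be loaded from a dictionary resource rather than typed in by hand.

```python
# Hypothetical hand-maintained table mapping a noun to all of its plurals.
IRREGULAR_PLURALS = {
    "child": ["children"],
    "fish": ["fish", "fishes"],
    "octopus": ["octopuses", "octopi", "octopodes"],
}


def all_plurals(word, fallback=lambda w: w + "s"):
    """Return a list of every known plural of `word`.

    Words missing from the table get a single rule-based guess from
    `fallback`, which defaults to the naive "+s" rule.
    """
    known = IRREGULAR_PLURALS.get(word)
    if known is not None:
        return list(known)  # copy so callers cannot mutate the table
    return [fallback(word)]
```

Returning a list in both cases keeps the caller's code the same whether a word has one plural or several.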

Cerin
abarnert
  • 1
    This seems generally correct. However, note that `wn.morphy('dog') == 'dogs'` equates to false on my current installation of NLTK, so it looks like `morphy()` does not appear to generate plural forms. – Cerin Sep 19 '13 at 21:59
  • @Cerin: Well, given that it's not a documented feature of WordNet or NLTK, and that the FAQ explicitly says it can't do this, I guess I shouldn't be surprised that it doesn't work on your computer… I'll edit the answer to say that directly. – abarnert Sep 19 '13 at 22:38
  • @Cerin: I've rolled back your edit, because `morphy('dog')` very clearly returns `'dogs'` on my installation, and the following text explains why you shouldn't rely on that (or the opposite) being true. – abarnert Sep 19 '13 at 22:45
  • Well, it doesn't on my fresh installation of nltk 2.0.4, which is the most recent version. What version do you have installed? Per your own comments, it shouldn't be doing what you claim it's doing. And from morphy's help(), it finds "a possible base form for the given form". 'dogs' is in no way a base form of 'dog'. – Cerin Sep 20 '13 at 13:16
  • 1
    if you're working on english, you dont really have to reinvent the wheel, they've done a well-coded and well-documented tool at http://www.clips.ua.ac.be/pages/pattern-en#pluralization – alvas Sep 22 '13 at 22:53
0

Most current pluralization libraries do not return multiple plurals for irregular words that have more than one. Some do not check that the word passed in is a noun and simply pluralize it by general rules. So I decided to build a Python library, Plurals and Countable, which is open source on GitHub. Its main purpose is to get plurals (yes, multiple plurals for some words), and it has an option to return only dictionary-approved plurals. It can also report whether a noun is countable, uncountable, or either.

import plurals_counterable as pluc
pluc.pluc_lookup_plurals('octopus', strict_level='dictionary')

returns the following dictionary:

{
    'query': 'octopus', 
    'base': 'octopus', 
    'plural': ['octopuses', 'octopi', 'octopodes'], 
    'countable': 'countable'
}

If you query by a noun's plural, the return also tells which word is its base (singular or plural-only word).

The library actually looks up the words in dictionaries, so it takes some time to request, parse and return. Alternatively, you might use the REST API provided by Dictionary.video. You'll need to contact admin@dictionary.video to get an API key. The call will look like this:

import logging

import requests


def lookup_plurals(word, api_key):
    """Query the plurals endpoint; return the parsed JSON, or None on failure."""
    url = 'https://dictionary.video/api/noun/plurals/%s?key=%s' % (word, api_key)
    response = requests.get(url)
    if response.status_code == 200:
        return response.json()
    logging.error('%s response: status_code[%d]', url, response.status_code)
    return None
wholehope