Consistent pos argument between wn.synsets() and WordNetLemmatizer.lemmatize() #1978
Comments
@alvations, I think we should map the expected behavior for pos=None to that of pos='n'. This is because if we don't pass the pos argument to WordNetLemmatizer, the noun pos is automatically assumed.
@alvations and @53X, a more consistent interpretation of pos=None would be nice, but in that case the default should not be "n", but rather "any pos". Please consider the morphy() wrapper in corpus/reader/wordnet.py: it uses itertools.chain to collect the lemmas from all the possible parts of speech, and that is the behaviour users would normally expect when no particular pos is specified. By contrast, a user who wants only nouns would specify pos="n".
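To make the "any pos" reading concrete, here is a minimal sketch of the chaining idea. It does not call NLTK: TOY_LEXICON, ALL_POS, toy_morphy() and lemmas_for() are hypothetical names invented for illustration, and the tiny dictionary only stands in for WordNet. It is not the actual code in corpus/reader/wordnet.py.

```python
from itertools import chain

# Hypothetical toy data standing in for WordNet (assumption, not NLTK's lexicon).
TOY_LEXICON = {
    "n": {"dogs": ["dog"], "leaves": ["leaf"]},
    "v": {"leaves": ["leave"], "running": ["run"]},
    "a": {"better": ["good"]},
}
ALL_POS = ("n", "v", "a")


def toy_morphy(form, pos):
    """Return the base forms of `form` for a single POS (empty list if unknown)."""
    return TOY_LEXICON[pos].get(form, [])


def lemmas_for(form, pos=None):
    """Treat pos=None as 'any POS' by chaining the per-POS results together."""
    if pos is None:
        return list(chain.from_iterable(toy_morphy(form, p) for p in ALL_POS))
    return toy_morphy(form, pos)
```

With this sketch, lemmas_for("leaves") collects both the noun and the verb reading, while lemmas_for("leaves", "n") is restricted to the noun reading, which is the distinction the comment above argues for.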
Ideally, to get consistent behaviour across the WordNet Morphy-related wrappers, WordNetLemmatizer.lemmatize() could just be an alias for the morphy() wrapper from wordnet.py. Actually, I find the name "WordNetLemmatizer" inadequate, since this wrapper eventually undoes the WordNet filtering done by _morphy() and ends up accepting any garbage input. So although WordNetLemmatizer uses _morphy(), it is unfortunate if it is perceived as a canonical wrapper for it.
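The "accepts any garbage input" point can be illustrated with a toy sketch. It assumes, as the comment above describes, that the lemmatizer returns its input unchanged whenever the strict lookup finds nothing. KNOWN, strict_lookup() and permissive_lemmatize() are hypothetical names made up for this example and are not part of NLTK.

```python
# Hypothetical toy lexicon keyed by (word, pos); an assumption for illustration only.
KNOWN = {("dogs", "n"): ["dog"], ("churches", "n"): ["church"]}


def strict_lookup(word, pos):
    """Strict behaviour: return only forms found in the lexicon, else an empty list."""
    return KNOWN.get((word, pos), [])


def permissive_lemmatize(word, pos="n"):
    """Permissive behaviour: fall back to the input itself when nothing is found."""
    lemmas = strict_lookup(word, pos)
    # Pick the shortest candidate; if there is none, echo the input back.
    return min(lemmas, key=len) if lemmas else word
```

Here strict_lookup("asdfgh", "n") returns an empty list, so the caller can tell the word is unknown, whereas permissive_lemmatize("asdfgh") hands the string back as if it were a valid lemma. That loss of information is what the comment objects to.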
PR #3225 proposes to add two standard "morphy" modes to the WordNetLemmatizer class, for users who want a standard morphy lemmatizer with a more consistent pos argument.
Currently, there's some inconsistency in how POS is treated in wn.synsets() and WordNetLemmatizer.lemmatize().
I'm not sure how to allow None in WordNetLemmatizer.lemmatize() though. What should be the expected behavior of pos=None: default to pos='n'? If so, then we can make changes at https://github.com/nltk/nltk/blob/develop/nltk/stem/wordnet.py#L39