Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"IndexError: string index out of range" on trying to stem the word "oing" #1614

Closed
peterbe opened this issue Feb 8, 2017 · 5 comments
Closed

Comments

@peterbe
Copy link

peterbe commented Feb 8, 2017

Easy to reproduce:

>>> from nltk import PorterStemmer
>>> stemmer = PorterStemmer()
>>> stemmer.stem('oing')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 665, in stem
    stem = self._step1b(stem)
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 376, in _step1b
    lambda stem: (self._measure(stem) == 1 and
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 258, in _apply_rule_list
    if suffix == '*d' and self._ends_double_consonant(word):
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 214, in _ends_double_consonant
    word[-1] == word[-2] and
IndexError: string index out of range
>>> import nltk
>>> nltk.__version__
'3.2.2'
@peterhurford
Copy link

I got this error for the word "aed":

from nltk.stem.porter import PorterStemmer
from nltk.corpus import stopwords
stemmer = PorterStemmer()
stemmer.stem('aed')

Error is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 665, in stem
    stem = self._step1b(stem)
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 376, in _step1b
    lambda stem: (self._measure(stem) == 1 and
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 258, in _apply_rule_list
    if suffix == '*d' and self._ends_double_consonant(word):
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 214, in _ends_double_consonant
    word[-1] == word[-2] and
IndexError: string index out of range

Installed with:

pip install nltk
python -m nltk.downloader -d

Version:

import nltk
nltk.__version__ # '3.2.2'

@ExplodingCabbage
Copy link
Contributor

ExplodingCabbage commented Feb 9, 2017

Duplicate of #1581. My fault; sorry. :(

The bug was introduced in version 3.2.2 and is fixed on master; you can either use develop or version 3.2.1 to get rid of the bug.

@jayvdb
Copy link
Contributor

jayvdb commented Mar 29, 2017

Close as fixed?

@g-laz77
Copy link

g-laz77 commented May 8, 2017

So, has the issue been resolved?

@alvations
Copy link
Contributor

This issue should have been resolved by #1582 😉

>>> import nltk
>>> nltk.__version__
'3.2.5'

>>> from nltk import PorterStemmer
>>> porter = PorterStemmer()
>>> porter.stem('oing')
u'o'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants