List index out of range #58

Open
MaxKe99 opened this issue Jan 4, 2021 · 2 comments

MaxKe99 commented Jan 4, 2021

Describe the bug
I tried to run the example parse_from_newsplease.py.
When attempting to extract the top answer for all 6 questions, I receive a list index out of range error, similar to #36.
Unfortunately, the fix proposed there does not work in my case.

[screenshot]

The error doesn't occur when extracting only Who, What and When.

To Reproduce
I used the code from parse_from_newsplease.py and added a few lines to extract and print answers for all 6 questions.
I installed Giveme5W1H through pip.

questions = ['who', 'what', 'when', 'where', 'why', 'how']
answers = []
for q in questions:
    # Raises IndexError when no answer was extracted for q
    answers.append(doc.get_top_answer(q).get_parts_as_text())
for answer in answers:
    print(answer)

Expected behavior

I expected to receive all six answers.

Versions (please complete the following information):

Lolologist commented

I am having the same problem with all the same versions; the only difference is that I'm on a Mac.

TitasDas commented Jun 20, 2021

The error actually has nothing to do with lines 150-151 of document.py, contrary to what was suggested in #36:

def get_top_answer(self, question):
        return self.get_answers(question=question)[0]

Please leave those lines unchanged. The IndexError means that the extractor found no answer to that question in the text it was given, so get_answers returns an empty list and indexing it with [0] fails.

I would suggest wrapping each question in a try/except/else block, as shown below, to see which question is not being answered.

    try:
        who_answer = doc.get_top_answer('who').get_parts_as_text()
    except IndexError:
        print("An answer for 'who' doesn't exist for this piece of text")
    else:
        print("Who :", who_answer)

Similarly, in the example given in parse_single_from_code.py, you may get the same error when you use only the lead or the short title, which contain very little text. With text, however, all the questions are answered and you don't encounter this error.
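To check all six questions at once instead of writing one block per question, the same try/except idea can be put in a loop. This is only a minimal sketch: `collect_top_answers` and `QUESTIONS` are hypothetical names, not part of Giveme5W1H, and the code assumes only that `doc.get_top_answer(q)` raises IndexError when no answer was extracted, as described above.

```python
QUESTIONS = ['who', 'what', 'when', 'where', 'why', 'how']


def collect_top_answers(doc, questions=QUESTIONS):
    """Map each question to its top answer text, or None if there is none.

    Hypothetical helper (not part of Giveme5W1H). `doc` is assumed to be
    a processed document exposing get_top_answer(question).
    """
    results = {}
    for q in questions:
        try:
            results[q] = doc.get_top_answer(q).get_parts_as_text()
        except IndexError:
            # get_answers() returned an empty list for this question
            results[q] = None
    return results
```

With the processed doc from parse_from_newsplease.py, something like `for q, a in collect_top_answers(doc).items(): print(q, ':', a)` would print every answer that exists and None for the questions the extractor could not answer.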
