CoreNLPNERTagger throws HTTPError: 500 Server Error: Internal Server Error for url: ...... #2010
Comments
Hi, do you see any errors coming from the CoreNLP log? |
Yes. CoreNLPPOSTagger worked as expected with no error. The error message when I ran CoreNLPNERTagger is Thanks. |
This looks like the key error on the CoreNLP side. Did you try tagging the sentence via the web interface at http://localhost:9000 ? |
Hi Dmitrijs, Thanks for pointing this out. I guess it's on the CoreNLP side. I tried several texts with person names and none of them worked in the live demo at this point, but I remember the demo site worked last week. Moving forward, if NLTK only provides wrappers for CoreNLP, then users have to worry about the server. Do you think it would be a good idea to keep StanfordNERTagger or something similar in the new version? Thank you. |
Actually, we should just deprecate the Stanford APIs in NLTK and only wrap around https://github.com/stanfordnlp/python-stanford-corenlp. But that'll require some work to clean out, wrap, merge the APIs with NLTK's objects, and test. Anyone up for a challenge? |
@hexingren Please try the following with NLTK v3.3. Please use the new First update your NLTK:
Then still in terminal:
Finally, start Python:
>>> from nltk.parse import CoreNLPParser
>>> parser = CoreNLPParser(url='http://localhost:9000')
>>> list(parser.parse(['house', ')', 'is', 'in', 'York', 'Avenue']))
[Tree('ROOT', [Tree('S', [Tree('NP', [Tree('NN', ['house']), Tree('-RRB-', ['-RRB-'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('PP', [Tree('IN', ['in']), Tree('NP', [Tree('NNP', ['York']), Tree('NNP', ['Avenue'])])])])])])]
>>> tagger = CoreNLPParser(url='http://localhost:9000', tagtype='ner')
>>> tokens = 'Rami Eid is studying at Stony Brook University in NY'.split()
>>> tagger.tag(tokens)
[('Rami', 'PERSON'), ('Eid', 'PERSON'), ('is', 'O'), ('studying', 'O'), ('at', 'O'), ('Stony', 'ORGANIZATION'), ('Brook', 'ORGANIZATION'), ('University', 'ORGANIZATION'), ('in', 'O'), ('NY', 'STATE_OR_PROVINCE')]
Are you still getting the error with the above? |
Closing the issue as resolved for now =) |
A similar error is occurring for me. |
@Bisht9887 would you be able to share the dataset and we'll test what happened? If not, could you post the full stacktrace of the error as well as the output on the console for the Stanford CoreNLP server? |
HTTPError Traceback (most recent call last) in name_extracter() ~\Anaconda3\lib\site-packages\nltk\parse\corenlp.py in tag(self, sentence) ~\Anaconda3\lib\site-packages\nltk\parse\corenlp.py in tag_sents(self, sentences) ~\Anaconda3\lib\site-packages\nltk\parse\corenlp.py in (.0) ~\Anaconda3\lib\site-packages\nltk\parse\corenlp.py in raw_tag_sents(self, sentences) ~\Anaconda3\lib\site-packages\nltk\parse\corenlp.py in api_call(self, data, properties) ~\Anaconda3\lib\site-packages\requests\models.py in raise_for_status(self) HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cssplit%2Cner%22%2C+%22ssplit.isOneSentence%22%3A+%22true%22%7D |
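The query string in the failing URL is the URL-encoded JSON `properties` object sent to the CoreNLP server, so decoding it shows which annotators were requested. The following is a minimal sketch using only the Python standard library; the helper name `decode_corenlp_properties` is made up for illustration and is not part of NLTK.

```python
import json
from urllib.parse import parse_qs, urlsplit


def decode_corenlp_properties(url):
    """Return the JSON 'properties' query parameter of a request URL as a dict.

    Hypothetical helper for debugging; not an NLTK or CoreNLP function.
    """
    query = urlsplit(url).query
    # parse_qs percent-decodes each value and turns '+' into a space.
    params = parse_qs(query)
    return json.loads(params["properties"][0])


# URL copied from the traceback in this thread.
failing_url = (
    "http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+"
    "%22annotators%22%3A+%22tokenize%2Cssplit%2Cner%22%2C+"
    "%22ssplit.isOneSentence%22%3A+%22true%22%7D"
)

props = decode_corenlp_properties(failing_url)
print(props["annotators"])
```

The decoded properties only describe the request. The cause of the 500 still has to be read from the CoreNLP server's console output.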
The data looks roughly like this. I have about 400 text files containing data similar to the sample shown below. I parse every line of every text file and pass the text after 'patient name:' to the NER tagger.
patient name: Johny, Rick Performed: Due: 21Mar2018; Last Updated By: Morgan; |
@hexingren do you know which line caused the error? Before Due to the nature of the dataset, I hope the sample in the previous comment is anonymized, or at least changed to fictional names. BTW, if the data is as structured as shown above, there's really no need for NER ;P |
@alvations : Thank you! The issue has been resolved as there were some empty tokens getting passed to NER, so now I have put a check for them. |
I didn't further try this wrapper in April but was inspired by this thread. It had something to do with |
@hexingren There wasn't any issue when I ran the code from #2010 (comment) through a sizeable corpus. @hexingren Could you do a quick check on your data and see whether you are still getting the same 500 Server Error? Thanks in advance! The issue @Bisht9887 raised was caused by an empty string. In that case, I think the API fails. @dimazest Maybe we should catch empty strings and return empty |
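A client-side guard for the empty-token case discussed above could look like the sketch below. It is only an illustration under stated assumptions: `tag_non_empty` is a hypothetical helper, not an NLTK function, and `tag` stands for any callable that maps a list of tokens to `(token, label)` pairs, such as the `tag` method of the `CoreNLPParser` tagger shown earlier in this thread.

```python
def tag_non_empty(tag, tokens):
    """Tag tokens with `tag`, dropping empty or whitespace-only tokens first.

    Hypothetical helper: `tag` is any callable taking a list of str and
    returning a list of (token, label) pairs.
    """
    cleaned = [tok for tok in tokens if tok and tok.strip()]
    if not cleaned:
        # Nothing left to tag, so no request is sent to the server.
        return []
    return tag(cleaned)


# Usage against a running server (assumes CoreNLP at http://localhost:9000):
# from nltk.parse import CoreNLPParser
# tagger = CoreNLPParser(url='http://localhost:9000', tagtype='ner')
# tag_non_empty(tagger.tag, 'Rami Eid is studying'.split(' '))
```

Filtering on the client avoids sending an empty request at all, which is a different choice from handling the failure inside the wrapper.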
@alvations I didn't use additional data. I was trying the example code in NLTK v3.2.5 and it didn't work on my machine. If the example code works in v3.3 now, then that's great! Thanks. |
It should work in v3.3. Here's the updated docs https://github.com/nltk/nltk/wiki/Stanford-CoreNLP-API-in-NLTK =) |
@alvations @dimazest I am also facing a similar error: requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22annotators%22%3A+%22tokenize%2Cssplit%2Cner%22%2C+%22ssplit.isOneSentence%22%3A+%22true%22%2C+%22outputFormat%22%3A+%22json%22%7D. I have to agree with @JohnnyLim. When I sent the text as a list with more than 100 items, it threw the error immediately, but when I sent only the first 5 items, it threw the error after printing the results for the first 4. Below is the full error I got when I ran the NER tagger through the API. NLTK == 3.4. java.util.concurrent.TimeoutException Could you please let me know if there is any workaround for this issue? I am planning to use the NER tagger on a much larger amount of text and am only trying a POC for now. Any input is much appreciated. In addition, when I went to the GUI for the API I got this error: "CoreNLP request timed out. Your document may be too long.". Thanks, |
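When only some items in a long list fail, tagging them one at a time and recording which ones raise can help narrow down the offending input. The sketch below assumes the same kind of `tag` callable as the tagger used in this thread. `tag_each` and its `errors` parameter are hypothetical names for illustration, not NLTK functions. With `requests` installed you would probably pass `errors=(requests.exceptions.RequestException,)` instead of the broad default.

```python
def tag_each(tag, sentences, errors=(Exception,)):
    """Tag each token list separately and collect failures instead of stopping.

    Hypothetical helper. Returns (results, failures): `results` has one entry
    per input (None where tagging failed), and `failures` is a list of
    (index, exception) pairs for the inputs that raised.
    """
    results, failures = [], []
    for index, tokens in enumerate(sentences):
        try:
            results.append(tag(tokens))
        except errors as exc:
            # Keep positions aligned with the input and remember what failed.
            results.append(None)
            failures.append((index, exc))
    return results, failures


# Usage against a running server (assumes CoreNLP at http://localhost:9000):
# from nltk.parse import CoreNLPParser
# tagger = CoreNLPParser(url='http://localhost:9000', tagtype='ner')
# results, failures = tag_each(tagger.tag, list_of_token_lists)
# for index, exc in failures:
#     print(index, exc)
```

This does not fix a server-side timeout by itself. The CoreNLP server also accepts a `-timeout` option in milliseconds at startup, which may be worth raising for long inputs.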
I've updated the wiki page https://github.com/nltk/nltk/wiki/Stanford-CoreNLP-API-in-NLTK/_compare/3d64e56bede5e6d93502360f2fcd286b633cbdb9...f33be8b06094dae21f1437a6cb634f86ad7d83f7 though it might be worth putting this information into the NLTK documentation to avoid spreading documentation over several sources. |
I am not sure why you are saying that the issue was resolved. It does not work for me. I used this link: https://stackoverflow.com/questions/52031337/stanfords-corenlp-name-entity-recogniser-throwing-error-500-server-error-inter and, unfortunately, it is not helpful at all. I still get the error. Every other tagging operation (such as POS tagging) works. I also don't think it has anything to do with the text, as NER tagging does not work for any text or sentence. I am sure I correctly followed the instructions to load the server: java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
and had no issues with loading it. Here is the code that I used: tagger = CoreNLPParser(url='http://localhost:9000', tagtype='ner') tokens = text.split() Here is what I got: HTTPError Traceback (most recent call last) /Applications/anaconda3/lib/python3.7/site-packages/nltk/parse/corenlp.py in tag(self, sentence) /Applications/anaconda3/lib/python3.7/site-packages/nltk/parse/corenlp.py in tag_sents(self, sentences) /Applications/anaconda3/lib/python3.7/site-packages/nltk/parse/corenlp.py in (.0) /Applications/anaconda3/lib/python3.7/site-packages/nltk/parse/corenlp.py in raw_tag_sents(self, sentences) /Applications/anaconda3/lib/python3.7/site-packages/nltk/parse/corenlp.py in api_call(self, data, properties, timeout) /Applications/anaconda3/lib/python3.7/site-packages/requests/models.py in raise_for_status(self) HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cssplit%2Cner%22%2C+%22ssplit.isOneSentence%22%3A+%22true%22%7D |
Hi everyone, I am getting the same error after 155 rows. The first 155 rows work fine, but after that I get the error
This is quite strange, because I did not have this error when using |
Hello @ehsong, Consider adding if you are getting |
Hello,
I'm using NLTK v3.2.5 and trying to use CoreNLPNERTagger with both Stanford CoreNLP v3.9.1 (the latest version) and v3.8.0. However, both throw an HTTPError: 500 Server Error.
The code is
"""
from nltk.tag.stanford import CoreNLPPOSTagger, CoreNLPNERTagger
CoreNLPPOSTagger(url='http://localhost:9000').tag('What is the airspeed of an unladen swallow ?'.split())
CoreNLPNERTagger(url='http://localhost:9000').tag('Rami Eid is studying at Stony Brook University in NY.'.split())
"""
CoreNLPPOSTagger was able to give the expected result, so I guess I set up the server correctly. The error message for CoreNLPNERTagger is
"""
HTTPError Traceback (most recent call last)
in ()
----> 1 CoreNLPNERTagger(url='http://localhost:9000').tag('Rami Eid is studying at Stony Brook University in NY.'.split())
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tag\stanford.py in tag(self, sentence)
229
230 def tag(self, sentence):
--> 231 return self.tag_sents([sentence])[0]
232
233 def raw_tag_sents(self, sentences):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tag\stanford.py in tag_sents(self, sentences)
225 # Converting list(list(str)) -> list(str)
226 sentences = (' '.join(words) for words in sentences)
--> 227 return list(self.raw_tag_sents(sentences))
228
229
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\tag\stanford.py in raw_tag_sents(self, sentences)
242 default_properties['annotators'] += self.tagtype
243 for sentence in sentences:
--> 244 tagged_data = self.api_call(sentence, properties=default_properties)
245 assert len(tagged_data['sentences']) == 1
246 # Taggers only need to return 1-best sentence.
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\parse\corenlp.py in api_call(self, data, properties)
249 )
250
--> 251 response.raise_for_status()
252
253 return response.json()
~\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\models.py in raise_for_status(self)
933
934 if http_error_msg:
--> 935 raise HTTPError(http_error_msg, response=self)
936
937 def close(self):
HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cssplit%2Cner%22%2C+%22ssplit.isOneSentence%22%3A+%22true%22%7D
"""
Could anyone point out what happened here? Thanks!