OSError: Java command failed when using stanford parser example #1239

Closed
yuvval opened this issue Dec 25, 2015 · 18 comments

@yuvval

yuvval commented Dec 25, 2015

Hi,

I am trying to run the Stanford parser example, e.g.:

from nltk.parse.stanford import * 
dep_parser=StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
[parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")]

Executing the last command results in an error:

OSError: Java command failed : [u'/usr/bin/java', u'-mx1000m', '-cp', ....

When I run the same command on the command line, I get the error Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory

After adding slf4j-api.jar to the classpath on the command line, parsing succeeds.

How can slf4j-api.jar be added to the NLTK classpath so that parsing succeeds?

Thank you!
Happy holidays

@alvations
Contributor

@yuvval Just to be sure, are you using Stanford Parser version 2015-12-09? If so, this error occurs because the new Stanford NLP release uses more dependencies than before. This is similar to #1237.

You will have to wait until #1237 is fixed and NLTK catches up with the Stanford tools.

The quick fix is to either:

  1. use the previous version 2015-04-20 from http://nlp.stanford.edu/software/stanford-parser-full-2015-04-20.zip and the NLTK API would work, see http://stackoverflow.com/questions/13883277/stanford-parser-and-nltk/34112695#34112695 or
  2. hack the stanford parser classpath:
from nltk.internals import find_jars_within_path
from nltk.parse.stanford import StanfordDependencyParser
dep_parser=StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
# NOTE: `st` is not defined in this snippet; a later comment in this thread
# identifies it as a Stanford tagger instance (e.g. StanfordNERTagger).
stanford_dir = st._stanford_jar.rpartition('/')[0]
# or in windows comment the line above and uncomment the one below:
#stanford_dir = st._stanford_jar.rpartition("\\")[0]
stanford_jars = find_jars_within_path(stanford_dir)
# Assign back to the same private attribute that was read above.
st._stanford_jar = ':'.join(stanford_jars)
[parse.tree() for parse in dep_parser.raw_parse("The quick brown fox jumps over the lazy dog.")]
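
Independent of NLTK, the idea behind the classpath hack is to put every jar that ships in the Stanford directory on the Java classpath, not only the main jar. The following is a minimal sketch using only the standard library. `build_classpath` is a hypothetical helper, not an NLTK function, and the directory name in the usage line is an assumed install location. It joins entries with `os.pathsep`, which is `:` on Linux/macOS and `;` on Windows.

```python
import glob
import os


def build_classpath(stanford_dir):
    """Join every .jar found under stanford_dir into one classpath string.

    Hypothetical helper for illustration; not part of NLTK.
    """
    # "**" with recursive=True also matches jars directly in stanford_dir.
    pattern = os.path.join(stanford_dir, "**", "*.jar")
    jars = sorted(glob.glob(pattern, recursive=True))
    return os.pathsep.join(jars)


if __name__ == "__main__":
    # Assumed location of the unzipped parser; adjust to your own setup.
    print(build_classpath(os.path.expanduser("~/stanford-parser-full-2015-12-09")))
```

The resulting string can be passed to `java -cp` on the command line to check whether the missing-class error goes away before touching any NLTK internals.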

@yuvval
Author

yuvval commented Dec 25, 2015

Thank you! It works with the 2015-04-20 version.

@yuvval yuvval closed this as completed Dec 25, 2015
@alvations
Contributor

Did the classpath hack also work?

@yuvval
Author

yuvval commented Dec 25, 2015

I didn't try - I just deleted the latest version and downloaded the 2015-04-20 version.

@cschwem2er

Hi! I tried to follow your hack, but for me there is no `StanfordDependencyParser`:

print(nltk.__version__)
from nltk.tag import StanfordDependencyParser

3.1
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-7-67bb74c3494a> in <module>()
----> 1 from nltk.tag import StanfordDependencyParser

ImportError: cannot import name 'StanfordDependencyParser'

Any idea how to solve this? I would really like to use the latest stanford version.

@alvations
Contributor

@methodds Pardon my typo, it's from nltk.parse.stanford import StanfordDependencyParser. Please see https://gist.github.com/alvations/e1df0ba227e542955a8a for detailed explanations.

@cschwem2er

Thank you for the link. Unfortunately, I can't get the environment variables to work on my Linux Mint OS.

My .bashrc looks like this:

export JAVA_HOME="/usr/lib/jvm/java-8-oracle/"
export PATH=$JAVA_HOME/bin:$PATH

export CLASSPATH="/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/stanford-postagger.jar:$CLASSPATH"

export CLASSPATH="/home/cs/stanford_nlp/stanford-ner-2015-04-20/stanford-ner.jar:$CLASSPATH"

export STANFORD_MODELS="/home/cs/stanford_nlp/stanford-ner-2015-04-20/classifiers:$STANFORD_MODELS"

export STANFORD_MODELS="/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/models:$STANFORD_MODELS"

Echoing the variables looks right:

echo $CLASSPATH
/home/cs/stanford_nlp/stanford-ner-2015-04-20/stanford-ner.jar:/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/stanford-postagger.jar

echo $STANFORD_MODELS
/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20/models:/home/cs/stanford_nlp/stanford-ner-2015-04-20/classifiers

However (even after rebooting) NLTK still does not find the tagger:

from nltk.tag.stanford import StanfordPOSTagger
st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
st.tag('What is the airspeed of an unladen swallow ?'.split())

NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH
environment variable.
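
The error suggests that the Python process running NLTK does not see the jar, even though the shell does. One way to check what the process itself sees is sketched below. `find_on_classpath` is a hypothetical helper for diagnosis only; it does not reproduce NLTK's own lookup logic.

```python
import os


def find_on_classpath(jar_name, classpath=None):
    """Return the CLASSPATH entries whose file name equals jar_name.

    Hypothetical diagnostic helper; not part of NLTK.
    """
    if classpath is None:
        # Read the value as seen by this Python process, not by the shell.
        classpath = os.environ.get("CLASSPATH", "")
    entries = [e for e in classpath.split(os.pathsep) if e]
    return [e for e in entries if os.path.basename(e) == jar_name]


if __name__ == "__main__":
    hits = find_on_classpath("stanford-postagger.jar")
    print(hits if hits else "stanford-postagger.jar is not on this process's CLASSPATH")
```

If this prints the "not on this process's CLASSPATH" message while `echo $CLASSPATH` in a terminal shows the jar, the exports are not reaching the environment Python was started from.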

@alvations
Contributor

Run source .bashrc and it will work. Meanwhile, take a look at http://apple.stackexchange.com/questions/12993/why-doesnt-bashrc-run-automatically to learn how .bashrc works.

@cschwem2er

Thank you for your tip, but I did source .bashrc beforehand and it did not work. I tried it again and unfortunately it's still not working.

@alvations
Contributor

What is your Linux distribution and version? Can you run lsb_release -a? Or are you working on a Mac?

@cschwem2er

Thank you for investigating. `lsb_release -a` returns:

No LSB modules are available.
Distributor ID: LinuxMint
Description:    Linux Mint 17.3 Rosa
Release:    17.3
Codename:   rosa

@alvations
Contributor

  • Where did you do the export commands? Which directory?
  • Where are you running your python scripts? Which directory?

Go to the place where you want to run your python script, do this: import os; print os.environ.

Then go to your home directory, start python and do the same: import os; print os.environ

Do the two sets of environment variables differ?

@cschwem2er

I guess you wanted me to use import os; print(os.environ), which did not show the environment variables that I exported in .bashrc. After that, I copy-pasted the content into .profile (in my home folder), and now it works perfectly. I have no idea why, though =D.
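
For anyone hitting the same symptom, a narrower check than dumping all of os.environ can help. The sketch below is a hypothetical helper (not part of NLTK) that reports only the variables set in the .bashrc shown earlier in this thread. One plausible explanation for the .profile fix, which is an assumption and not confirmed in this thread, is that Bash reads .bashrc only for interactive non-login shells, so a Python process launched from the desktop session or an IDE never inherits those exports, whereas .profile is typically read at login.

```python
import os

# Variable names taken from the .bashrc shown earlier in this thread.
RELEVANT = ("JAVA_HOME", "CLASSPATH", "STANFORD_MODELS")


def stanford_env(env=None):
    """Return the Stanford-related variables visible in env (default: os.environ).

    Hypothetical diagnostic helper; variables that are not set map to None.
    """
    if env is None:
        env = os.environ
    return {name: env.get(name) for name in RELEVANT}


if __name__ == "__main__":
    for name, value in stanford_env().items():
        print(name, "=", value if value is not None else "<not set>")
```

Running this once from a terminal and once from wherever the failing script is launched shows whether the two environments differ.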

@alvations
Contributor

Glad that .profile works; I think it's an OS distro issue. I would not recommend storing the environment variables statically. Personally, I set them again every time I start my Python scripts, so that I can be sure there's no conflict. Have fun with the NLTK API and Stanford tools!
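
Setting the variables per script, as described above, might look like the following sketch. It assumes the NLTK wrappers read CLASSPATH and STANFORD_MODELS from the process environment when they are constructed, as the error message earlier in this thread suggests. `prepend_env_path` is a hypothetical helper, and the paths are the ones from the .bashrc earlier in this thread.

```python
import os


def prepend_env_path(name, path, env=None):
    """Prepend path to the os.pathsep-separated variable `name` in env.

    Hypothetical helper; defaults to modifying os.environ for this process only.
    """
    if env is None:
        env = os.environ
    current = env.get(name, "")
    env[name] = path + os.pathsep + current if current else path
    return env[name]


if __name__ == "__main__":
    # Paths copied from the .bashrc earlier in this thread; adjust to your setup.
    base = "/home/cs/stanford_nlp/stanford-postagger-full-2015-04-20"
    prepend_env_path("CLASSPATH", os.path.join(base, "stanford-postagger.jar"))
    prepend_env_path("STANFORD_MODELS", os.path.join(base, "models"))
    # Create the NLTK tagger only after this point so it sees the updated values.
```

Because the change affects only the current process and its children, it cannot conflict with whatever is exported in shell startup files.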

@cschwem2er

Thank you :)

@hansen7

hansen7 commented Dec 21, 2016

What is 'st' in the command 'stanford_dir = st._stanford_jar.rpartition('/')[0]'?

@KazutoshiShinoda

I have the same question as hansen7

@katkamrachanaso

For those wondering what st is:
st = StanfordNERTagger(os.environ.get('STANFORD_MODELS'))
Ref: https://gist.github.com/manashmndl/810db10809cbc1209b34c7d25efe95d5#file-stanfordnertagger-py
