local mode works but not hadoop mode #2
I believe this is related to infochimps-labs/wukong#11. Wukong::Hadoop::HadoopInvocation#ruby_interpreter_path appears to set the right ruby path, but I can't find where that method is called. The only other clue I have is that my stderr job logs also show:
I'm becoming more convinced that the issue lies somewhere in how wu-hadoop loads the environment. I wrote some straight Ruby scripts and ran them through Hadoop streaming directly.
These worked fine, so I'm guessing the Gemfile is somehow not getting loaded, or is getting unset, under normal wu-hadoop. BTW, sorry for the stream-of-consciousness issue posting, but I'm hoping this will be useful to others hitting the same issues, and will be useful Google fodder.
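For reference, the kind of plain-Ruby sanity check described above might look like the following. This is a sketch, not the scripts from the thread: the file name `wordcount.rb` and both method names are hypothetical, and it deliberately avoids wukong so it only tests Hadoop streaming itself.

```ruby
#!/usr/bin/env ruby
# Hypothetical word-count mapper/reducer for a Hadoop Streaming sanity
# check, independent of wukong. Select the role with ARGV[0].

# Mapper: emit a [word, 1] pair for every word on an input line.
def map_line(line)
  line.strip.downcase.split(/\W+/).reject(&:empty?).map { |w| [w, 1] }
end

# Reducer: sum counts per word. Hadoop Streaming sorts mapper output by
# key before the reducer sees it, so equal keys arrive contiguously.
def reduce_pairs(pairs)
  counts = Hash.new(0)
  pairs.each { |word, n| counts[word] += n }
  counts
end

if __FILE__ == $PROGRAM_NAME
  case ARGV.shift
  when 'map'
    STDIN.each_line { |l| map_line(l).each { |w, n| puts "#{w}\t#{n}" } }
  when 'reduce'
    pairs = STDIN.each_line.map { |l| k, v = l.chomp.split("\t"); [k, v.to_i] }
    reduce_pairs(pairs).each { |w, n| puts "#{w}\t#{n}" }
  end
end
```

You can mimic the streaming framework locally with a pipe (`cat sonnet_18.txt | ruby wordcount.rb map | sort | ruby wordcount.rb reduce`); if that also runs under the streaming jar but fails under wu-hadoop, the problem is in the environment wu-hadoop's child processes inherit, not in streaming itself.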
Are you running on the 1.0-ish branch of Hadoop (cdh3 / 0.20, etc.) or the 2.0-ish branch (cdh4)? Does Hadoop streaming work for you at all? What do the log files from the child process say? (You get those by clicking through the tasks on the job tracker UI, or by drilling into the non-world-readable dirs in /var/log/hadoop.) Sent from my iPad, on Feb 7, 2013, in reply to Scott Carleton (notifications@github.com).
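The "drilling into dirs" step above amounts to something like the following. These paths are illustrative only (1.x-era single-node layout; the exact location varies by Hadoop version and distro), and the job/attempt directory names come from your own run:

```shell
# Per-attempt logs for a streaming job live under userlogs/<job_id>/<attempt_id>/.
# List jobs, then read a task attempt's stderr (where Ruby exceptions land):
ls /var/log/hadoop/userlogs/
cat /var/log/hadoop/userlogs/job_*/attempt_*/stderr
```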
Just figured it out! Of course, in the end it was simple. I'm using Hadoop 1.1.1. I had
After adding the rvm script to the end of my hadoop bash environment, it appears to be working. What are your thoughts on throwing a warning if the Gemfile isn't there? I guess it wouldn't make sense if wukong were installed globally. Not sure if you guys use rvm much, but I'd be happy to add a note to the docs saying "if you're using rvm, make sure to add it to hadoop-env and restart the cluster."
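The fix described above amounts to something like the following at the end of hadoop-env.sh. A sketch, assuming a standard per-user RVM install; a system-wide install keeps the script at /usr/local/rvm/scripts/rvm instead:

```shell
# hadoop-env.sh (end of file): load RVM so Hadoop's child processes
# inherit the right ruby, gem paths, and RVM's wrapper scripts.
# Restart the cluster after editing.
[ -s "$HOME/.rvm/scripts/rvm" ] && source "$HOME/.rvm/scripts/rvm"
```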
tl;dr:
The design parameters as I see them:
The only two workable solutions I'm aware of are:
Specifying
@dhruvbansal, would you please revert that behavior?
Thanks for the write-up. Since I was quite new to map/reduce et al., I wanted to get it all working locally on my MacBook before moving to a distributed setting, but of course that makes dealing with the idiosyncrasies of my workstation a pain, as opposed to using chef recipes and getting it right the first time in a controlled environment. I did learn a lot the hard way, though :) If your intention is to let more developers get up and running with wukong quickly, then designing/implementing an opinionated packager could be a priority, or possibly just some very, very comprehensive tutorial docs. An alternative could be a short doc on how to use ironfan to set up a vagrant instance, so it would at least be a controlled environment. However, considering all the work involved in just making an awesome Ruby wrapper to quickly code data flows and deploy them to a cluster, making sure it works perfectly in a local hadoop cluster should probably not be a priority.
@mrflip I'd love to have the opinionated packager option working but at present only the network drive approach is really feasible.
@ScotterC did one of the above "supported" use-cases not work for you? I'd like a more robust model, but this is what we've got today. Agreed with @ScotterC that these constraints deserve better documentation.
@dhruvbansal I guess the supported case that ended up working for me was hadoop mode without an NFS, munging the path loading to make sure everything needed was there. It was just very confusing, due to the finicky nature of hadoop configuration and fully comprehending how it loads up. I believe the process will be much smoother when set up over a network. Local mode is a breeze, however, which is really the strength of the library: it lets you test code before putting it out there and wasting compute cycles. I'm now going to proceed towards deploying a deploy-pack with ironfan, which appears to be fairly well documented here. I'll let you know if I run into difficulties.
Not sure if this is a full-fledged issue, but I'm posting it here because the Google group doesn't appear to be very active.
In short, the examples work fine in local mode but not in hadoop mode.
This works as expected:
However, when I switch over to the single-node hadoop cluster running locally, it fails.
I've put sonnet_18.txt into HDFS, and normal hadoop jar examples such as the pi calculation work fine.
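For anyone reproducing this, the two setup steps just described amount to something like the following. The HDFS destination path and examples-jar name are illustrative (the jar name varies by Hadoop release, though hadoop-examples-1.1.1.jar matches the 1.1.1 install mentioned below):

```shell
# Stage the input file into HDFS.
hadoop fs -put sonnet_18.txt /user/$USER/sonnet_18.txt

# Non-streaming smoke test: the pi example that ships with Hadoop.
hadoop jar $HADOOP_HOME/hadoop-examples-1.1.1.jar pi 10 100
```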
Command:
I get
and the job details print:
This is with hadoop 1.1.1, wukong 3.0.0.pre3, and wukong-hadoop 0.0.2.
Any pointers for debugging the Java side would be highly appreciated.
1st EDIT:
Found that further down the stack trace there is:
I'm using RVM, so my wu-local is located at:
/Users/ScotterC/.rvm/gems/ruby-1.9.3-p194/bin/wu-local
Writing out full paths for wu-local got me to my next error.
Error:
2nd EDIT:
Digging through Hadoop job logs gives me
env: ruby_noexec_wrapper: No such file or directory
So I'm wondering why it can't find my ruby implementation. I assume that's what's producing the "subprocess failed" error.
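For context: ruby_noexec_wrapper is a loader script that RVM installs and places in the shebang line of the gem binstubs it generates (wu-local included). The error above therefore means the environment Hadoop hands its child processes doesn't have RVM's bin directories on the PATH, so env(1) can't resolve the shebang. A quick check, reusing the RVM path mentioned earlier in this issue:

```shell
# RVM-generated binstubs begin with "#!/usr/bin/env ruby_noexec_wrapper",
# so env must be able to find ruby_noexec_wrapper on the PATH.
head -1 ~/.rvm/gems/ruby-1.9.3-p194/bin/wu-local
command -v ruby_noexec_wrapper || echo "ruby_noexec_wrapper not on PATH"
```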