CDK-29, CDK-466, CDK-827: Include use of CLI flume-config. #79

Open
wants to merge 5 commits into
base: master

DennisDawson
Contributor

No description provided.

@DennisDawson DennisDawson changed the title Include use of CLI flume-config. CDK-29, CDK-466, CDK-827: Include use of CLI flume-config. Mar 5, 2015
@DennisDawson
Contributor Author

On the 5.2 VM (for sure), the version of kite-dataset is "0.15.0-cdh5.2.0." That version doesn't include flume-config, which I'm using in the example. In preparing-the-vm.md, I should point to whatever we say in Kite-Install.md. PR 76 (#76) addresses this issue. I'll point to those instructions rather than duplicating them in preparing-the-vm.md.


Complete the following steps to run Kite example code on a Cloudera QuickStart VM.

* Install a VirtualBox or VMware [Cloudera QuickStart VM][getvm] version 5.2 or later.
Contributor

Should note that VirtualBox is recommended. People generally have more problems with VMware.

Contributor Author

If we say we recommend one over the other, even if it's true, it calls into question the quality of our VM and ruffles the feathers of a partner. It's fine to say that if one doesn't work, try the other, which covers both bases.

Contributor

Good point!

@rdblue
Contributor

rdblue commented Mar 5, 2015

Overall feedback:

  • I think there generally needs to be more context to tell the reader why he needs to read the page, what he will learn, and what the next steps are. There should be a clear overall picture of what the whole tutorial series does and where each article fits.
  • What the event generator does should be last and is a side discussion. The tutorial should focus on getting interesting events into the dataset and what interesting events are. The generator code isn't in the critical path to understanding the tutorial series; it is simply a helper so the reader has data to work with.
  • The servlet and JSP shouldn't be the focus of the Flume pipeline tutorial. Flume, the configuration, what is happening, and why it is built that way should be the focus of that tutorial.

@rdblue
Contributor

rdblue commented Mar 5, 2015

You might find the labs we built recently helpful for the background information. A lot of the Flume pipeline explanation is the same, though the specific example isn't and it doesn't use Log4j to send records.

http://kitesdk.org/docs/0.17.1/labs/6-create-a-flume-pipeline.html


## Configuring the VM

Some Kite examples require Flume. If you use Cloudera Manager, Flume user impersonation is configured for you. If you do not use Cloudera Manager, you must update Flume user impersonation in `core-site.xml`.
Contributor

I think we need to add more background on Cloudera Manager. The steps above get you a working VM with Kite, but don't mention CM, while this assumes the user knows about the choice and has already decided whether to use CM. I think a small section to introduce that choice is needed. It should include the fact that we recommend using CM, how to start it (opening the link on the desktop, I believe), and why you would choose not to use it (it requires more memory allocated to the VM).

Contributor Author

I'll add a section on CM, with the 8GB memory and 2 CPU recommended configurations. This is documented elsewhere and should not be a barrier to publication of our examples, even if we choose to include it.
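
For readers taking the non-Cloudera Manager path discussed in this thread, the impersonation settings are most likely the standard Hadoop proxy-user properties for the Flume user. A minimal sketch of the `core-site.xml` fragment, assuming the Flume agent runs as the user `flume` and that wildcard hosts and groups are acceptable on a single-node VM (both are assumptions, not something this PR confirms):

```xml
<!-- Assumed fragment: place inside the existing <configuration> element of core-site.xml. -->
<!-- Lets the "flume" user (assumed agent user) impersonate members of any group... -->
<property>
  <name>hadoop.proxyuser.flume.groups</name>
  <value>*</value>
</property>
<!-- ...when connecting from any host. Wildcards are an assumption suited to a test VM. -->
<property>
  <name>hadoop.proxyuser.flume.hosts</name>
  <value>*</value>
</property>
```

The wildcard values are permissive and fit a throwaway VM; a shared cluster would normally restrict both. The NameNode generally has to be restarted, or have its proxy-user settings refreshed, before the change takes effect.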

## Purpose
This lesson describes the steps for configuring a virtual machine to run Kite example code on a Cloudera QuickStart VM.

### Result
Contributor

The prerequisite for this lab is that the user has either VMware or VirtualBox installed. Links to download those tools would be helpful.

Contributor Author

VM installation is in the instructions, with a link to the download location.


2 participants