The Hive Query Tool

The Hive Query Tool (HQT, for short) is a web interface for running reports using Apache Hive. It is intended, first and foremost, to be as easy to use as possible for non-technical people, while allowing them to customize the reports they run with a high degree of flexibility, well beyond simple variable substitution.

Installation

Installation is actually quite easy, and the HQT builds and runs successfully on a surprising variety of platforms. Still, there are some known prerequisites and you'll have to install & configure those properly on your own.

Requirements & Compatibility

The currently known compatible platforms and system requirements are listed below.

Compatible Platforms

The HQT has been successfully installed and run on the following platforms:

  • CentOS 5.x (with a newer perl installed under /opt)

  • CentOS 6.x

  • Fedora Core 17 & 18

  • Ubuntu 12.04 through 13.04

  • Mac OS X 10.8.x (Mountain Lion)

  • Possibly others. Please let us know!

Whatever platform you use, you will have to ensure you have certain other components installed and/or configured in order to install and run the HQT.

System Prerequisites

The base requirements that should be installed and configured on your system prior to installing and/or running the HQT are as follows:

  • A *nix or *nix-like environment

  • Perl 5.10.1 or newer

  • Hadoop and Hive client binaries and libraries

  • A JVM that works with the installed Hadoop and Hive, um... stuff

  • A C Compiler and standard build toolchain (gcc/xcode known to work)

  • OpenSSL libs and development headers

For additional functionality like being able to authenticate users and run jobs as those users, you will also need the following:

  • Sudo

  • LDAP client libraries (and possibly binaries)

  • An LDAP server your users authenticate against

It's very possible there are other requirements. Please, please, please file a bug report for anything you needed that isn't described above!

Notes about the Requirements

Perl 5.10.1 or newer is required. It was released in 2009, so it's already fairly old (the most recent released version of perl is 5.18.0 as of my writing this comment in 2013). However, RHEL/CentOS 5.x is stuck with Perl 5.8.8 or thereabouts, which is positively ancient - we're talking 2005-2006!

If you really think you're "stuck", you're not. There are a variety of ways to get a newer perl to make this work. The simplest and safest option is to use perlbrew. Perlbrew automates the entire process of getting, configuring, building, and using any version of perl you want.

You don't need to be root (in fact, I strongly discourage you from installing anything perl-related as root unless it's a package straight from your OS vendor's repos), and you don't even need the CPAN to install it. Instructions are all here: https://metacpan.org/module/App::perlbrew
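As a rough sketch of the perlbrew route, the commands below install perlbrew and a newer perl under your home directory, without root. They follow perlbrew's own documented installer; perl-5.16.3 is only an example version, the function name install_newer_perl is made up for this illustration, and bash plus network access are assumed.

```shell
#!/bin/bash
# Hypothetical helper: wraps the perlbrew steps so nothing runs until you call it.
install_newer_perl() {
    # Example version only; any perl 5.10.1 or newer satisfies the HQT.
    local version="${1:-perl-5.16.3}"

    # Fetch and run the perlbrew installer (installs under ~/perl5, no root needed).
    curl -L https://install.perlbrew.pl | bash || return 1

    # Load perlbrew into the current shell session.
    source ~/perl5/perlbrew/etc/bashrc || return 1

    # Download, build, and install the requested perl (this can take a while).
    perlbrew install "$version" || return 1

    # Print where the new binary should live in perlbrew's default layout.
    echo "$HOME/perl5/perlbrew/perls/$version/bin/perl"
}
```

Run install_newer_perl as your normal user. The path it prints is the kind of path you can later hand to the HQT setup script.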

Obtaining the Code

The official main source-code repository for the HQT is currently on GitHub under the TripAdvisor Organization, specifically... here: https://github.com/tripadvisor/hive-query-tool

You very well might be reading that web page right now :)

You can download any version, revision, or branch you wish in a number of ways: with git, with svn, or over HTTP as a compressed archive that you then unpack locally.
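For illustration, here are two of those routes as shell commands, wrapped in a small function so that nothing is downloaded until you call it. The function name fetch_hqt is invented for this sketch, and the archive URL assumes GitHub's usual archive layout and a branch named master; adjust it if the branch you want is called something else.

```shell
#!/bin/sh
REPO_URL="https://github.com/tripadvisor/hive-query-tool"

# Hypothetical helper: fetch the HQT source with git, or as a tarball over HTTP.
fetch_hqt() {
    case "$1" in
        git)
            # Full clone, including history and all branches.
            git clone "$REPO_URL.git"
            ;;
        archive)
            # Snapshot of one branch (assumed here to be "master"), unpacked in place.
            curl -L "$REPO_URL/archive/master.tar.gz" | tar -xzf -
            ;;
        *)
            echo "usage: fetch_hqt git|archive" >&2
            return 2
            ;;
    esac
}
```

Calling fetch_hqt git should leave a hive-query-tool directory in the current directory; the archive route typically unpacks to a directory named after the repository and branch.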

Setup/Installation Procedure

Please do *not* do any of this as the root or admin user of your system. While the HQT setup script does its best to install all of the CPAN modules it depends on in a way that is isolated and therefore should have no effect on the system perl, installing as root is unnecessary and always risky. In most *nix-like environments, too many things depend on the system perl being "just so" and if something changes it, fixing the problem can be a nightmare.

As you were.

Start on a system that meets the minimum HQT requirements described above. As a non-root user, obtain and/or unpack the source code and enter the top-level directory, where you will find a copy of this very file (if you're not already reading that copy!).

Run the setup script

This script does almost everything. You'll need a connection to the internet because it downloads all the HQT's dependencies from the CPAN. You won't even need a CPAN client already installed, as this script will download one to bootstrap the whole process.

If you run:

./hqt-setup

it will check whether the default perl on your system is new enough and, if it isn't, it will look in a few common places for one. If it can't find a new enough version of perl on its own, it will give up and exit with an error.
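The search described above can be sketched roughly as follows. This is not the actual code from hqt-setup: the function names are invented for illustration, and the list of candidate locations is whatever you pass in, since the places the real script looks are not documented here.

```shell
#!/bin/sh
# Succeeds if the given perl binary is version 5.10.1 or newer.
perl_is_new_enough() {
    # "use 5.010001" makes perl itself exit with an error on anything older.
    "$1" -e 'use 5.010001;' >/dev/null 2>&1
}

# Prints the first candidate that is new enough; fails if none is.
find_usable_perl() {
    for candidate in "$@"; do
        if perl_is_new_enough "$candidate"; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no perl 5.10.1 or newer found" >&2
    return 1
}
```

For example, find_usable_perl perl /usr/local/bin/perl /opt/perl/bin/perl tries the default perl first and then two guessed locations, stopping at the first one that is new enough.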

If you want to use a specific installation of perl (for example, one you installed using perlbrew), pass the path to that perl binary as an argument to the setup script, like so:

./hqt-setup ~/perl5/perlbrew/perls/perl-5.16.3/bin/perl

Go get yourself a cup of coffee. Maybe order a pizza. The first run can take a while as the entire dependency chain is resolved, downloaded, built, tested and installed. We've pared back some of the direct dependencies though, so it's now a lot less time than it used to be.

Once done, it's usually good to run it again, just to make sure there's really nothing more it has to install. If it succeeds without installing anything then you should be all set!

If the setup script did not succeed and running it repeatedly makes no difference, please file a bug report on GitHub.

Configuration

TODO: describe what needs to go in conf/hqt_config.yaml

Running

TODO: describe how to run the frontend with start-frontend and the backend with start-backend and the options they may take...

More Information

More information can be found in the POD of the various modules, which you can read with either man or perldoc once you have sourced hqt-setup.

Also, I may end up putting info in the wiki GitHub provides for the project.

Basically, the documentation is still horribly incomplete and hard to find. We're working on it, though. (patches welcome!)

Credits

Thanks to everybody who's helped develop the HQT, whether it be through code, documentation, bug reports or feature requests!

~ Stephen R. Scaffidi, initial author and current primary maintainer

Special thanks to:

  • Rashmi Nayak, lots and *lots* of code, especially web stuff

  • Bill Langenberg, code, documentation, feature requests, and more

Copyright 2013 TripAdvisor, LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
