Analyzing and detecting anomalies in S3 Data using Athena JDBC Driver

tewfik-ghariani/cloud-storage-analyzer

Cloud Storage Analyzer project.

Description

A web application that provides data aggregation, anomaly detection, and data inspection for files residing in AWS S3 buckets. The tool queries the data through AWS Athena, using its JDBC driver.

  • Created: 01 March 2017
  • Author: Tewfik Ghariani
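
To make the Athena part of the description concrete, here is a minimal sketch of how a Python process could query Athena through a JDBC driver. It is not taken from this repository: it assumes the third-party JayDeBeApi package and the 2017-era Athena JDBC driver, with the class name `com.amazonaws.athena.jdbc.AthenaDriver` and the `s3_staging_dir` property. The helper names and the `jar_path` argument are hypothetical.

```python
def athena_jdbc_url(region):
    """Build the JDBC URL for a given AWS region (format assumed from the 1.x driver)."""
    return "jdbc:awsathena://athena.{}.amazonaws.com:443".format(region)


def athena_driver_args(access_key, secret_key, staging_dir):
    """Connection properties; Athena writes query results to the S3 staging directory."""
    return {
        "user": access_key,
        "password": secret_key,
        "s3_staging_dir": staging_dir,
    }


def run_query(sql, region, access_key, secret_key, staging_dir, jar_path):
    """Run one SQL statement against Athena and return all rows."""
    # Imported lazily so the helpers above work without the package installed.
    import jaydebeapi  # assumption: installed separately with pip

    conn = jaydebeapi.connect(
        "com.amazonaws.athena.jdbc.AthenaDriver",  # assumed driver class name
        athena_jdbc_url(region),
        athena_driver_args(access_key, secret_key, staging_dir),
        jars=jar_path,  # hypothetical path to the downloaded driver JAR
    )
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()
```

Because the driver runs inside a JVM, this approach is one reason a Java runtime appears among the prerequisites below.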

Configuration

Hi there!

Before using the application, you have to set up your environment. Don't worry: just follow these steps and you'll be ready!

Python 3

Java 8
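
If you want to verify these two prerequisites before going further, a small check along the following lines may help. It is only a sketch: the function names are illustrative, and it assumes that Java 8 reports a version string starting with `1.8`, as the Oracle and OpenJDK builds do.

```python
import shutil
import subprocess
import sys


def python_ok(version_info=sys.version_info):
    """True when the interpreter is Python 3 or newer."""
    return version_info[0] >= 3


def java_version_output():
    """Return the text printed by `java -version`, or None if java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes to stderr, so merge it into stdout.
    result = subprocess.run(
        [java, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def is_java_8(version_text):
    """True when the version banner looks like a Java 8 runtime."""
    return version_text is not None and 'version "1.8' in version_text


if __name__ == "__main__":
    print("Python 3:", "ok" if python_ok() else "missing")
    print("Java 8:", "ok" if is_java_8(java_version_output()) else "missing")
```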

Pip

$ sudo apt-get install python3-pip
$ sudo pip3 install virtualenv

Virtual Environment

$ virtualenv venv

Activate your virtual environment (you can turn it off afterwards by simply typing deactivate):

$ source venv/bin/activate

Postgres:

PostgreSQL 9.4 is required because the application uses the JSONB data type, which was introduced in that release.

$ sudo apt-get install postgresql-9.4
$ sudo su - postgres
$ psql
CREATE DATABASE athena_db;
CREATE USER si_aps WITH PASSWORD '<password>';
ALTER ROLE si_aps SET client_encoding TO 'utf8';
ALTER ROLE si_aps SET default_transaction_isolation TO 'read committed';
ALTER ROLE si_aps SET timezone TO 'UTC';
GRANT ALL PRIVILEGES ON DATABASE athena_db TO si_aps;
\q
$ exit

Ref : Django + Postgres
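
The database and role created above have to match the Django database settings. The project's actual settings file is not shown here, so the following is only a sketch of what the `DATABASES` entry could look like. The environment variable `ATHENA_DB_PASSWORD`, the host, and the port are assumptions.

```python
import os

# Sketch of a Django DATABASES setting matching the database and role created above.
DATABASES = {
    "default": {
        # On Django versions older than 1.9, use "django.db.backends.postgresql_psycopg2".
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "athena_db",
        "USER": "si_aps",
        # Hypothetical variable name; avoids committing the password to the repository.
        "PASSWORD": os.environ.get("ATHENA_DB_PASSWORD", ""),
        "HOST": "localhost",  # assumed: PostgreSQL runs on the same machine
        "PORT": "5432",       # PostgreSQL's default port
    }
}
```

Reading the password from the environment keeps the `<password>` chosen in the `CREATE USER` step out of version control.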

Bower Configuration

$ sudo apt-get install npm
$ sudo npm install -g bower

In case of error:

$ sudo ln -s /usr/bin/bower /usr/local/bin/bower

or

$ sudo ln -s /usr/bin/nodejs /usr/bin/node

Python Dependencies

$ pip install -r requirements.txt

Front-end Dependencies

$ python manage.py bower update

Database Migration

$ python manage.py makemigrations
$ python manage.py migrate

$ python manage.py collectstatic  # only in production

Usage

Admin account

$ python manage.py createsuperuser

Run the server!

$ python manage.py runserver

http://127.0.0.1:8000/
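
To confirm that the development server is answering at this address, a quick check such as the one below can be run from a second terminal. It is an illustrative sketch using only the standard library, and the function name `server_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://127.0.0.1:8000/", timeout=2):
    """Return True if something answers HTTP requests at `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, even if with an error status such as 404 or 500.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False


if __name__ == "__main__":
    print("server reachable" if server_is_up() else "server not reachable")
```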

Your environment is set up. Follow these instructions to use our prototype and contribute to the development of our analyzer.
