Skip to content

kamireddym28/Log-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 

Repository files navigation

Log Analysis Project -- FSDN Project3

The objective of this project is to analyze and query "News" database inorder to generate log results.

Requirements

Usage

  • Install either of the python versions from the URL provided
  • Install the VirtualMachine(VM) to run an SQL database server using the tools provided*
  • Download the database and unzip it
  • Place the unziped file (newsdata.sql) in vagrant folder on local computer
  • Bring the VM online
  • Load the database by cd into the vagrant directory
  • Set the path to the newsdata.sql inside the vagrant directory
  • Execute the newsdatadb.py file using python newsdatadb.py
  • Log results can be found in results.txt on succesful execution .py file
  • Use CTRL+D to exit the VM

Description of files

  • newsdatadb.py : Code to analyze and generate log results on "News" DB.
  • results.txt : Output file to which analyzed log results are written
  • newsdata.sql : News DB containing "articles" , "authors" and "log" tables

Commands

  • Bring the VM online using vagrant up
  • Log into VM using vagrant ssh
  • Load the database using psql -d news -f newsdata.sql
  • import psycopg2 is a module to connect to *** PostgreSQL *** , an open source RDBMS

VIEWS Created**

  • Part 1:
   CREATE OR REPLACE VIEW popular_article as
   SELECT articles.slug as most_popular_articles, COUNT(*) as views
   FROM log, articles
   WHERE CONCAT('/article/',articles.slug)=log.path
   GROUP BY articles.slug
   ORDER BY views DESC limit 3;
  • Part 2:
   CREATE OR REPLACE VIEW popular_article_authors as
   SELECT authors.name,count(*) as views
   FROM authors, articles,log
   WHERE articles.author=authors.id
   AND CONCAT('/article/',articles.slug)=log.path
   GROUP BY authors.name
   ORDER BY views DESC;
  • Part 3:
   CREATE OR REPLACE VIEW error_log as
   SELECT (date(time)) as unique_date, count(*) as error
   FROM log
   WHERE status like '%4%'
   GROUP BY unique_date
   ORDER BY error DESC;

   CREATE OR REPLACE VIEW total_log as
   SELECT (date(time)) as unique_date, count(*) as status
   FROM log
   GROUP BY unique_date
   ORDER BY status DESC;

   CREATE OR REPLACE VIEW percent_error_requests as
   SELECT total_log.unique_date,
   ROUND((100.0/(total_log.status/error_log.error)),2) as error_percent
   FROM total_log, error_log
   WHERE total_log.unique_date=error_log.unique_date
   GROUP BY total_log.unique_date, total_log.status, error_log.error
   ORDER BY error_percent DESC;

Courtesy

News database (newsdata.sql) was provided by Udacity

Note:

Vagrant and VirtualBox are the tools to install and manage the VM*

Reference "newsdatadb.py" to find the usage of the created VIEWs**

About

Applied SQL skills to a reporting tool that summarizes data from a large database.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages