Requirements:
- Download Spark 2.1+
- Java JDK 8
- sbt (to build the project)
Build the project:

```
sbt package
```
The config file specifies where the log files are kept and where the Parquet folder (the structured output) should be written.

Example for STRING. This example uses /scratch/local, but /scratch/cluster could be used instead to run on the cluster:
```yaml
name: STRING
# Directory where to find the log files
logDirectory: /scratch/local/weekly/dteixeir/string-logs/*
# Directory where to output or read the parquet file
parquetFile: /scratch/local/weekly/dteixeir/string-parquet/
```
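For illustration, here is a minimal sketch of how such a config could be read in Scala, assuming the SnakeYAML library is on the classpath; the tool's actual config loader may differ:

```scala
// Minimal sketch of reading the YAML config, assuming SnakeYAML is available;
// the project's actual config loading may differ.
import java.io.FileInputStream
import java.util.{Map => JMap}
import org.yaml.snakeyaml.Yaml

val stream = new FileInputStream("configs/oma-config.yaml")
// SnakeYAML parses a flat YAML mapping into a java.util.Map
val config = new Yaml().load(stream).asInstanceOf[JMap[String, String]]
println(s"logDirectory: ${config.get("logDirectory")}")
println(s"parquetFile:  ${config.get("parquetFile")}")
stream.close()
```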
Start the tool with a config file:

```
./start.sh configs/oma-config.yaml
```
Choose the appropriate option (options 2 and 3 require the Parquet file produced by option 1):
```
Using config file configs/oma-config.yaml
---
1) Convert Parquet
2) Insights Report
3) Distinct IPs
4) Quit
```
Option 1, Convert Parquet, is required before the other options can be used. This conversion turns the raw log files into a structured, indexed format (Parquet) for fast analysis.
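As an illustration of the idea behind this step, here is a minimal sketch written as a spark-shell script (so `spark` is predefined). The log-line regex and the `LogEntry` schema are assumptions for the example, not the project's actual parsing logic:

```scala
// Sketch of the log-to-Parquet conversion; run inside spark-shell.
import spark.implicits._

// Illustrative schema; the project's actual columns may differ.
case class LogEntry(ip: String, timestamp: String, request: String, status: Int)

// Illustrative pattern for Apache common-log-format lines
val LogLine = """^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}).*""".r

val entries = spark.read
  .textFile("/scratch/local/weekly/dteixeir/string-logs/*")
  .flatMap {
    case LogLine(ip, ts, req, status) => Some(LogEntry(ip, ts, req, status.toInt))
    case _                            => None // skip malformed lines
  }

// Parquet stores the data column-wise and compressed, so later queries
// scan far less data than the raw text logs.
entries.write.parquet("/scratch/local/weekly/dteixeir/string-parquet/")
```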
Option 2, Insights Report, generates a report to be included in Insights.
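As a hypothetical example of the kind of aggregation such a report might include (the report's real contents are not specified here), run in spark-shell against the Parquet data, using the illustrative schema from the conversion sketch above:

```scala
// Example aggregation over the Parquet data; run inside spark-shell.
import org.apache.spark.sql.functions._

val logs = spark.read.parquet("/scratch/local/weekly/dteixeir/string-parquet/")

// Requests per HTTP status code, most frequent first
logs.groupBy("status")
  .agg(count("*").as("requests"))
  .orderBy(desc("requests"))
  .show()
```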
Option 3, Distinct IPs, produces a file with all distinct IPs.
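A minimal sketch of this extraction in spark-shell; the column name `ip` and the output path are assumptions for the example:

```scala
// Extract distinct client IPs from the Parquet data; run inside spark-shell.
val logs = spark.read.parquet("/scratch/local/weekly/dteixeir/string-parquet/")

logs.select("ip")
  .distinct()
  .coalesce(1)  // write one part file for easy consumption
  .write
  .text("/scratch/local/weekly/dteixeir/string-distinct-ips/")
```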
The analysis script is launched via spark-shell:

```
spark-shell $SPARK_SCRIPT_MEMORY -i scripts/analyse.scala
```
(The documentation below is not yet complete.)
To run the analyser directly with spark-submit:

```
$SPARK_HOME/bin/spark-submit --class org.elixir.insights.server.logs.ServerLogAnalyser --master local[4] target/scala-2.11/server-log-analytics_2.11-1.0.jar
```
Start a Spark worker with 32 cores and 100 GB of memory:

```
./start-slave.sh -c 32 -m 100G spark://rserv01.vital-it.ch:7077
```
Run the analysis on the cluster (this run took about 2.5 hours):

```
spark-shell --executor-memory 100G --master spark://rserv01.vital-it.ch:7077 -i analyse.scala
```