@cerndb

CERN Database and Analytics Group

Popular repositories

  1. dist-keras (Public archive)

    Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

    Python · 624 stars · 170 forks

  2. spark-dashboard (Public)

    Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.

    Dockerfile · 89 stars · 22 forks

  3. SparkPlugins (Public)

    Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics syst…

    Scala · 78 stars · 15 forks

  4. hdfs-metadata (Public)

    Tool for gathering block and replica metadata from HDFS. It also builds a heat map showing how replicas are distributed across disks and nodes.

    Java · 56 stars · 19 forks

  5. SparkDLTrigger (Public)

    Code and links to the data for the article "Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics".

    Jupyter Notebook · 29 stars · 12 forks

  6. Hadoop-Profiler (Public)

    Hadoop Profiler, or hprofiler, is a tool that analyzes on- and off-CPU workloads in distributed computing environments.

    Shell · 24 stars · 10 forks

Repositories

Showing 10 of 66 repositories
