
Big data course at CRI cri-paris.org (authors: Marc, Loic, Liubov, Anirudh, Felix)

Big-data-course-CRI/materials_big_data_cri_2019

Big data course 2019

Authors: Marc & Liubov (network theory) | Anirudh & Felix (Big data in mental health) | Loic (data management)

Marc Santolini: marc.santolini@cri-paris.org

Liubov Tupikina: liubov.tupikina@cri-paris.org

This is the repository of the CRI Digital Master "Big data" course for Fall 2019.

Course description on network theory

(Marc, Liubov)

This course provides an introduction to the field of big data, with a focus on network data and data for mental health. Topics include data project management, big data infrastructure, data analysis and visualisation, and mental health data. The course is divided into two parts: big data and network data.

Why focus on network data? Over the past century, network studies have had a significant impact in disciplines as varied as mathematics, sociology, physics, biology, computer science and quantitative geography, giving birth to Network Science as a field in its own right. With the rise of social networks over the last decade, their use has become widespread in the digital world. Here we provide an introduction to the field of Network Science, from the theoretical foundations (generating, analysing and perturbing networks) to a practical hands-on part (analysis and visualisation of a real-world network).

Evaluation of the network theory course

  1. Reverse classroom topics (https://docs.google.com/document/d/1fWYlafN2GUoiqX-tuVJpsXAeqlY3HqhVrY9SRSqxroM/edit?usp=sharing), presented on 11 December 2019.
  2. Projects (see the GitHub folder with the notebook template), due on 18 December 2019. All student projects are located in folders at https://github.com/Big-data-course-CRI

Network topics of the course will cover

  1. How to construct networks from real data?
  2. How to analyze networks? (centrality measures, community detection, statistical analyses etc.)
  3. How to visualise networks?
  4. Dynamics and spreading phenomena on networks (epidemics / information spreading, diffusion)
  5. How does network wiring change over time? (network robustness, temporal networks)
  6. How to represent more complex network data? Multilayer, multiplex networks.
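As an illustration of topics 1, 2 and 5, here is a minimal sketch in plain Python, not taken from the course notebooks. It builds an undirected network from an edge list, computes degree centrality, and checks robustness by removing the most central node and measuring the largest connected component. The edge list and all function names are hypothetical and chosen only for this example.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def largest_component_size(graph):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best


def remove_node(graph, target):
    """Return a copy of the graph without `target` and its edges."""
    return {node: nbrs - {target} for node, nbrs in graph.items() if node != target}


# Hypothetical toy data: a star around "a" plus one extra edge.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
graph = build_graph(edges)
centrality = degree_centrality(graph)
hub = max(centrality, key=centrality.get)

print("most central node:", hub, centrality[hub])
print("largest component before removal:", largest_component_size(graph))
print("largest component after removal:", largest_component_size(remove_node(graph, hub)))
```

For real projects, a dedicated network library is likely a better choice than hand-written functions. The sketch is only meant to show what the measures compute.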

Students will select, analyse and present a network of their choice as a personal project for the course. They will also choose an advanced topic in network science & big data and present it in a reverse classroom setting. As part of this, they will contribute to a Wikipedia page on that topic.

Data Efforts in the Mental Health part:

(Anirudh, Felix)

In this part, students will be introduced to the infrastructure of ‘big data’: the barriers, current trends, data types, protocols and importance of ‘big data’ collection in the sphere of mental health. The part is built around three efforts:

  1. The Healthy Brain Network project, which collects and shares neuroimaging & phenotypic data for 10,000 children.
  2. A Linked Semantic Mental Health Database and scientific framework mapping signs, symptoms and behaviors to subjective and objective measures, projects and technologies (https://github.com/ChildMindInstitute/mhdb/wiki).
  3. The MindLogger Data Collection Platform & App, which aims to dramatically improve the convenience, consistency, efficiency, accuracy & analysis of widely distributed data efforts (https://mindlogger.org/).

Students will also contribute to the development of the second and third efforts.

Students will then spend the last part of the course working on a research project developing and applying digital tools related to ‘big data’ and mental health, using the skills obtained from the first part of the course.

Data management

(Loic)

Resources

Introductory material on networks:

(Marc, Liubov)

Big Data & Mental Health:

(Felix, Anirudh)

Topics and ideas for the projects

Collaborations with other existing projects

Criteria for projects

  1. loading or collecting your own network data
  2. analyzing data from your network and answering your research questions
  3. documenting your project (in README files of the project)
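For the first criterion, a project notebook might start by loading an edge list. The sketch below assumes a CSV file with `source` and `target` columns. The column names, the sample rows and the `load_edge_list` helper are all hypothetical, so adapt them to your own data.

```python
import csv
import io


def load_edge_list(handle):
    """Read (source, target) pairs from a CSV file with a header row."""
    reader = csv.DictReader(handle)
    return [(row["source"].strip(), row["target"].strip()) for row in reader]


# Hypothetical sample standing in for a real file,
# e.g. open("edges.csv", newline="") in your own project.
sample = io.StringIO("source,target\nalice,bob\nbob,carol\n")
edges = load_edge_list(sample)

# Collect the distinct nodes that appear in any edge.
nodes = sorted({node for edge in edges for node in edge})
print(len(nodes), "nodes,", len(edges), "edges")
```

Printing the node and edge counts right after loading is a quick sanity check worth recording in the project README.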

How to use GitHub

Getting started guide: https://guides.github.com/activities/hello-world/

To push your project folder to an existing GitHub repository, see https://help.github.com/en/github/using-git/pushing-commits-to-a-remote-repository

Let us know your GitHub account and the topics you are interested in: https://docs.google.com/spreadsheets/d/1dsMScv5jodScJQ8dXe66ZbOLGK3o8-cq8KlrHdB2Nr8/edit?usp=sharing
