
Python project for the Algorithmic Methods of Data Science class for the MSc. in Data Science at the Sapienza University of Rome. The main purpose of the project is exploring Network concepts like Shortest Walk, Min Cut, Densest Subgraph, etc. and building algorithms to explore these concepts.

msancor/ADM-HW5

Algorithmic Methods for Data Mining - Homework 5

This is a GitHub repository created to submit the fifth homework of the Algorithmic Methods for Data Mining (ADM) course for the MSc. in Data Science at the Sapienza University of Rome.


What's inside this repository?

  1. README.md: A markdown file that explains the content of the repository.

  2. main.ipynb: A Jupyter Notebook file containing all the relevant exercises and reports belonging to the homework questions, the Command Line Question, and the Algorithmic Question.

  3. modules/: A folder including 4 Python modules used to solve the exercises in main.ipynb. The files included are:

    • __init__.py: An init file that allows us to import the modules into our Jupyter Notebook.

    • data_handler.py: A Python file including a DataHandler class designed to handle data cleaning and feature engineering on Kaggle's Citation Network Dataset.

    • backend.py: A Python file including a Backend class that implements the 5 functionalities used to solve the exercises from the homework.

    • frontend.py: A Python file including a Frontend class that visualizes the 5 functionalities of the Backend.

  4. commandline.sh: A bash script including the code to solve the Command Line Question.

  5. .gitignore: A standard .gitignore file that tells Git which files or folders to ignore in a Python project.

  6. LICENSE: A file containing the permissive MIT License.

Dataset

In this homework we worked with Kaggle's predefined Citation Network Dataset.
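
As a rough illustration of the kind of network this homework explores, the sketch below builds a small directed citation graph and finds a shortest path with breadth-first search. It is only a minimal example under stated assumptions: the toy records, the field names `id` and `references`, and the helper functions `build_citation_graph` and `shortest_path` are hypothetical and are not taken from the dataset's actual schema or from the code in modules/.

```python
from collections import deque

# Hypothetical toy records (NOT the real dataset schema):
# each paper lists the ids of the papers it cites.
papers = [
    {"id": "A", "references": ["B", "C"]},
    {"id": "B", "references": ["D"]},
    {"id": "C", "references": ["D"]},
    {"id": "D", "references": ["E"]},
    {"id": "E", "references": []},
]


def build_citation_graph(records):
    """Return a directed adjacency dict: paper id -> list of cited paper ids."""
    graph = {}
    for record in records:
        graph.setdefault(record["id"], [])
        for cited in record["references"]:
            graph[record["id"]].append(cited)
            # Make sure cited papers appear as nodes even without a record.
            graph.setdefault(cited, [])
    return graph


def shortest_path(graph, source, target):
    """Breadth-first search on an unweighted directed graph.

    Returns the list of nodes on a shortest path from source to target,
    or None if the target cannot be reached.
    """
    if source not in graph or target not in graph:
        return None
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


graph = build_citation_graph(papers)
path = shortest_path(graph, "A", "E")
print(path)
```

Because citations are directed, a path from "A" to "E" does not imply a path back from "E" to "A"; in this toy graph the reverse query returns None.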

Important Note

If the notebook doesn't load on GitHub, please try one of these steps:

  1. Try rendering the notebook through NBViewer.

  2. Try downloading the notebook and opening it on your local computer.


Author: Miguel Angel Sanchez Cortes

Email: sanchezcortes.2049495@studenti.uniroma1.it

MSc. in Data Science, Sapienza University of Rome
