10-Academy-quad-squad/casual-graph

casual-graph

Overview

  • A common frustration in industry, especially when extracting business insights from tabular data, is that the questions most interesting to the business often cannot be answered with observational data alone. Examples of such questions include:

    • “What will happen if I halve the price of my product?”
    • “Which clients will pay their debts only if I call them?”
  • The causal graph is a central object in causal inference, the framework used to answer such questions, but it is often unknown, shaped by personal knowledge and bias, or only loosely connected to the available data. The main objective of this project is to show, in a concrete way, why this matters.

  • In this project we applied a causal graph model to a breast cancer dataset, using the CausalNex library, to learn the cause-and-effect structure behind the diagnosis of breast cancer.

Project Structure

The repository contains Python scripts, Jupyter notebooks, PDFs, and text files. Their structure is outlined below with a brief explanation of each part.

Data

  • We use the Breast Cancer Wisconsin (Diagnostic) Data Set, downloaded from Kaggle.

notebooks

  • EDA.ipynb: a Jupyter notebook for exploratory data analysis

scripts

tests

  • the folder containing unit tests for components in the scripts

logs

  • the folder containing log files (created automatically when logging starts, if it does not already exist)
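
The "created when logging starts" behaviour of the logs folder can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual logging code: the helper name `get_logger`, the `log_dir` parameter, and the log file naming are all assumptions made for this example.

```python
import logging
from pathlib import Path


def get_logger(name: str, log_dir: str = "logs") -> logging.Logger:
    """Return a logger writing to <log_dir>/<name>.log (hypothetical helper)."""
    # Create the log folder on first use; do nothing if it already exists.
    log_path = Path(log_dir)
    log_path.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Attach the file handler only once, so repeated calls do not duplicate lines.
    if not logger.handlers:
        handler = logging.FileHandler(log_path / f"{name}.log")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A script could then call `get_logger("eda").info("Loaded dataset")`, which would create `logs/eda.log` on the first call if the folder is missing.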