This repository has been archived by the owner on Dec 2, 2023. It is now read-only.

This repository holds an implementation of a binary classifier for the BreakHis image dataset!


Igor03/breast-cancer-classifier

Introduction

Breast cancer is one of the most common forms of cancer in the world. It is estimated that about 8% of the female population will be affected by it at some point in their lives. Despite all advances in treatment, early diagnosis remains essential for the effectiveness of any applied technique. To this end, several techniques and algorithms have been developed in the form of computer-aided diagnosis (CADx) systems, whose main objective is to support the diagnostic process. In this context, this work presents a methodology based on deep learning features and Support Vector Machines for the automatic classification of breast lesions in histopathological images. When applied to the BreakHis image dataset, the methodology proved promising, reaching an accuracy of 97.3%.

BreakHis dataset

To validate the presented methodology, the BreakHis image dataset was used. This dataset contains 7909 histopathological images, divided by magnification: 40X, 100X, 200X and 400X. The table below details this division.

Class       40X    100X   200X   400X
Benign      625    631    636    588
Malignant   1370   1437   1390   1232
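
As a quick sanity check, the counts in the table can be summed to confirm the total of 7909 images and to see how the two classes are balanced at each magnification. This is a minimal sketch: the numbers are copied from the table, while the names `COUNTS`, `class_totals` and `malignant_share` are illustrative and not part of the repository.

```python
# Image counts per class and magnification, copied from the table above.
COUNTS = {
    "benign": {"40X": 625, "100X": 631, "200X": 636, "400X": 588},
    "malignant": {"40X": 1370, "100X": 1437, "200X": 1390, "400X": 1232},
}


def class_totals(counts):
    """Return the total number of images per class."""
    return {label: sum(per_mag.values()) for label, per_mag in counts.items()}


def malignant_share(counts):
    """Return, per magnification, the fraction of images that are malignant."""
    shares = {}
    for mag, benign in counts["benign"].items():
        malignant = counts["malignant"][mag]
        shares[mag] = malignant / (benign + malignant)
    return shares


totals = class_totals(COUNTS)
print(totals)                # {'benign': 2480, 'malignant': 5429}
print(sum(totals.values()))  # 7909
for mag, share in malignant_share(COUNTS).items():
    print(f"{mag}: {share:.1%} malignant")
```

Malignant images outnumber benign ones at every magnification, which is worth keeping in mind when reading a single accuracy figure.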

Running

  • Define the path of the dataset in the settings.py file;
  • Organize the dataset following the structure below. We provide functions to help with this task;
    dataset\
       40x\
           benign\
           malignant\
       100x\
           benign\
           malignant\
       200x\
           benign\
           malignant\
       400X\
          benign\
          malignant\
       
    
  • Run the main.py file to start feature extraction and classification.
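
The folder layout above can be prepared and checked with a few lines of standard-library Python. The sketch below is not the helper code shipped in this repository; `target_dir`, `organize` and `missing_dirs` are hypothetical names. It assumes the usual BreakHis file naming (for example `SOB_B_A-14-22549AB-40-001.png`, where `B`/`M` marks the class and the second-to-last dash-separated field is the magnification), and it uses the folder names exactly as written in the listing.

```python
import shutil
from pathlib import Path

# Folder names exactly as written in the listing above (note the upper-case 400X).
MAG_DIRS = {"40": "40x", "100": "100x", "200": "200x", "400": "400X"}
CLASS_DIRS = {"B": "benign", "M": "malignant"}


def target_dir(filename, root):
    """Map a BreakHis file name to its folder under `root`.

    Assumes names such as SOB_B_A-14-22549AB-40-001.png.
    """
    stem = Path(filename).stem
    class_code = stem.split("_")[1]      # "B" or "M"
    magnification = stem.split("-")[-2]  # "40", "100", "200" or "400"
    return Path(root) / MAG_DIRS[magnification] / CLASS_DIRS[class_code]


def organize(source, root):
    """Copy every .png found under `source` into the layout rooted at `root`.

    `root` should not be located inside `source`.
    """
    copied = 0
    # Materialize the file list first so copies are never re-visited.
    for image in sorted(Path(source).rglob("*.png")):
        destination = target_dir(image.name, root)
        destination.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image, destination / image.name)
        copied += 1
    return copied


def missing_dirs(root):
    """Return the expected sub-folders that do not exist under `root`."""
    expected = [
        Path(root) / mag / label
        for mag in MAG_DIRS.values()
        for label in CLASS_DIRS.values()
    ]
    return [folder for folder in expected if not folder.is_dir()]


# Example: where a single file would be placed under a folder called "dataset".
print(target_dir("SOB_B_A-14-22549AB-40-001.png", "dataset"))
```

Calling `missing_dirs` on the path configured in settings.py before running main.py gives an early hint when a magnification or class folder is absent.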

Paper

We also produced a paper that has a more detailed explanation of the approach adopted in this methodology. There we present related works, detailed results and future enhancements for this work.

Results

The following table shows the results achieved by the presented methodology.

[Image: results table]

The next image is a boxplot detailing the accuracy results for all the architectures used to extract features.

[Image: boxplot of accuracy per feature-extraction architecture]

Environment

These are the languages and tools used in the project:

  • Python 3.11
  • PyCharm
  • LaTeX
  • Overleaf

Important

  • If you plan on running this project, we highly recommend using the Python version shown in the previous section. We also provide a requirements file with all the required dependencies.
  • In case you're having a hard time trying to understand or run anything in this repo, feel free to contact me.
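
The Python version recommendation can be turned into an early warning at start-up. This is a small sketch, assuming the Python 3.11 listed in the Environment section; the names `RECOMMENDED` and `version_matches` are illustrative and do not come from the repository.

```python
import sys

RECOMMENDED = (3, 11)  # version listed in the Environment section


def version_matches(version_info=sys.version_info, recommended=RECOMMENDED):
    """Return True when the interpreter's major.minor equals the recommended one."""
    return tuple(version_info[:2]) == recommended


if not version_matches():
    found = ".".join(str(part) for part in sys.version_info[:2])
    # Warn instead of exiting: other versions may still work.
    print(f"Warning: this project lists Python 3.11, running {found}", file=sys.stderr)
```

A warning rather than a hard failure keeps the script usable on nearby versions while still flagging the mismatch.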
