mrsaraei/Covid19_Data_Analysis

Covid19_Data_Analysis

Machine Learning for COVID-19 Data Analysis Project

  • Researcher: Mohammadreza Saraei [1] 🇮🇷
  • Principal Investigator: Dr. Saman Rajebi (Website) 🇮🇷
  • Advisor: Dr. Sebelan Danishvar (Website) 🇬🇧

COVID-19 Data Acquisition

Clinical Diagnostic Tasks

  • ✅ Task 1: The COVID-19 Self-Assessment
  • ✅ Task 2: The COVID-19 Screening
  • ✅ Task 3: The COVID-19 Detection
  • ✅ Task 4: The COVID-19 Severity Assessment

Technical Implementation Tasks

  • Step 1: Data Acquisition for Potential COVID-19 Cases (n=~2500)
  • Step 2: Data Encoding for CSV-based Numerical & Categorical Values
  • Step 3: Data Fusion for COVID-19 Diagnostic Tasks (Early-Fusion Type I)
  • Step 4: Data Annotation for COVID-19 Diagnostic Tasks (Based on WHO Guidelines v2020-2021)
  • Step 5: Data Preprocessing for COVID-19 Diagnostic Tasks (e.g., Data Normalization, Data Cleaning and Data Balancing)
  • Step 6: Feature Selection for Improving the Comparative Machine Learning Models (e.g., SelectKBest, Correlation Heatmap, LassoCV, and Extra Trees Classifier)
  • Step 7: Developing Comparative Machine Learning Models for Training COVID-19 Diagnostic Tasks (n=9)
  • Step 8: Evaluation of Comparative Machine Learning Models for COVID-19 Diagnostic Tasks Validation
  • Step 9: Visualization of Comparative Machine Learning Models Outputs (e.g., ROC Curve, Precision-Recall Curve, Confusion Matrix, and Learning Rate)
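
The steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual code: the two "modalities" (`clinical`, `laboratory`), the choice of `k=8`, the three classifiers, and the F1 scoring are all assumptions made for the example, and early fusion is assumed here to mean feature-wise concatenation before training. Only scikit-learn calls that exist in the public API are used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the patient data (assumption): 300 rows, 12 features.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, random_state=0
)

# Step 3 (assumed form of early fusion): treat the columns as two modalities
# and concatenate them feature-wise before any model sees them.
clinical, laboratory = X[:, :6], X[:, 6:]
fused = np.hstack([clinical, laboratory])

# Step 7: a small, hypothetical subset of models to compare.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
}

results = {}
for name, model in models.items():
    # Steps 5-6: normalization and feature selection live inside the pipeline,
    # so they are re-fitted on each training fold and never see the test fold.
    pipeline = make_pipeline(MinMaxScaler(), SelectKBest(f_classif, k=8), model)
    # Step 8: 5-fold cross-validated F1 score.
    scores = cross_val_score(pipeline, fused, y, cv=5, scoring="f1")
    results[name] = scores.mean()

# Print the models from best to worst mean F1.
for name, score in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: mean F1 = {score:.3f}")
```

Keeping the scaler and the selector inside the pipeline is a deliberate choice in this sketch: fitting them on the full dataset before cross-validation would leak information from the validation folds into training.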

A Different Traditional Approach for Automatic Comparative Machine Learning in Multimodality Covid-19 Severity Recognition

Note

Saraei, M., Rahmani, S., Rajebi, S., Danishvar, S. (2023). Int. J. Innov. Eng., 3(1), 1-12. doi: https://doi.org/10.59615/ijie.3.1.1

A global pandemic of a novel coronavirus disease, COVID-19, was declared by the World Health Organization in March 2020. This viral outbreak, originating in Wuhan, China in 2019, has spurred the development of novel AI-powered diagnostic approaches for COVID-19. However, these tools are often hampered by significant false-negative rates, potentially jeopardizing patient outcomes. To address this critical challenge, we present a study on a COVID-19 severity recognition model that leverages a combination of 2,500 multimodal data points and an Early Fusion Type-I (EFT1) architecture. Two automated systems were implemented to facilitate machine learning algorithm comparison and feature selection. The Descended Composite Scores Average (DCSA) method served as the performance evaluation metric. The Extreme Gradient Boost algorithm achieved the highest DCSA score (0.998) within the AutoCML system, while the Random Forest algorithm outperformed others in the AutoIFSCML system (DCSA score: 0.960). These findings demonstrate the potential of fine-tuned machine-learning models to enhance diagnostic accuracy in COVID-19. Additionally, the study underscores the efficacy of ensemble learning in boosting the performance of traditional models.
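
The README does not define how the Descended Composite Scores Average is computed, so the following is only one plausible reading of the name, not the paper's method: average several standard scores per model, then sort the models in descending order. The helper names (`scores_from_confusion`, `rank_by_average_score`), the four metrics chosen, and the confusion-matrix counts are all hypothetical.

```python
from statistics import mean


def scores_from_confusion(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def rank_by_average_score(confusions):
    """Average the scores of each model and sort the models best-first."""
    averaged = {
        name: mean(scores_from_confusion(*counts).values())
        for name, counts in confusions.items()
    }
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Made-up (tp, fp, fn, tn) counts for three hypothetical models.
confusions = {
    "model_a": (45, 5, 5, 45),
    "model_b": (40, 10, 10, 40),
    "model_c": (48, 2, 2, 48),
}

ranking = rank_by_average_score(confusions)
for name, score in ranking:
    print(f"{name}: average score = {score:.3f}")
```

A single averaged score makes it easy to order many models at once, at the cost of hiding which individual metric (for example recall, which matters for false negatives) drives the ranking.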

Figures: AutoCML, AutoIFSCML, Descended Composite Scores Average

Footnotes

  1. Please feel free to e-mail if you have any questions: mrsaraei3@gmail.com