CoughVid: COVID-19 Detection from Cough Voice Samples

Project Objective:

This project was created with the intention of detecting whether an audio signal is a sickness sound such as a cough or a sneeze.
A further goal is to build this into a pipeline for mobile applications that narrows the detection to sickness sounds specific to COVID-19.
The aim is to help government authorities identify people with a probable coronavirus infection living among us. ("We should fight the virus, not the patient affected by the virus.")

Challenge

Develop machine learning models for detection of sickness sounds (coughing and sneezing) for Covid-19

Data Description

Dataset Source: Link

Motivation:

This dataset has been created for the Pfizer Digital Medicine Challenge.

Early detection of respiratory tract infections can lead to timely diagnosis and treatment, which can result in better outcomes and reduce the likelihood of severe complications.
Respiratory sounds carry rich information that can be mined to develop automated approaches for detection of sickness behaviors like coughing and sneezing.
In this challenge, we invite you to build machine learning models for automatic detection of sickness sounds by using audio recordings from open datasets.
The dataset was created using audio files from ESC-50 and AudioSet.
We used the open source BMAT Annotation Tool to annotate this dataset.

Dataset

The dataset is organized as follows:

train
    sick (n=1435)
    not_sick (n=2283)

validation
    sick (n=468)
    not_sick (n=753)

test
    sick (n=642)
    not_sick (n=1012)
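
As a quick check on the split sizes listed above, the short sketch below totals each split and computes the share of sick clips. The counts are copied from this README; the variable names (counts, totals, sick_share) are only illustrative and do not come from the project code.

```python
# Clip counts per split and class, copied from the dataset listing above.
counts = {
    "train": {"sick": 1435, "not_sick": 2283},
    "validation": {"sick": 468, "not_sick": 753},
    "test": {"sick": 642, "not_sick": 1012},
}

# Total number of clips in each split.
totals = {name: sum(split.values()) for name, split in counts.items()}

# Fraction of each split that belongs to the "sick" class.
sick_share = {name: counts[name]["sick"] / totals[name] for name in counts}

for name in counts:
    print(f"{name}: {totals[name]} clips, {sick_share[name]:.1%} sick")
```

The sick share comes out between 38% and 39% in all three splits, so the class ratio is consistent across them, with not_sick as the majority class.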

COVID-19 Cough Audio: 49-Year-Old Male in the UK

Patient Details:

Age: 49
Sex: Male
Country: UK
Day: 5
Resource Date: Mar 23, 2020
Infection Symptoms: Cannot breathe, heavy coughs
Health Status before being affected by COVID-19: Healthy person, regular swimmer

Preprocessing

The data is cleaned, and the class distribution is as follows:

The analysis above shows that, in the training folder, clip lengths are distributed similarly for both classes.

MFCC feature extraction is applied to every training sample, producing a 13x99 matrix of coefficients. This is how the audio data is converted into NumPy arrays.
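
To illustrate how a 13x99 MFCC matrix can arise, here is a minimal from-scratch sketch using only the Python standard library. It is not the project's actual extraction code, and this README does not state the parameters used. The sketch assumes one-second clips at 16 kHz, 25 ms windows with a 10 ms hop, zero-padding of the final frame, a 512-point FFT and 26 mel filters. All function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * hz / sample_rate) for hz in edges]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(n_filters)]
    for j in range(n_filters):
        lo, mid, hi = bins[j], bins[j + 1], bins[j + 2]
        for k in range(lo, mid):  # rising slope
            bank[j][k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):  # falling slope
            bank[j][k] = (hi - k) / (hi - mid)
    return bank


def mfcc(signal, sample_rate=16000, win=0.025, hop=0.010,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return MFCCs as a list of n_coeffs rows, one column per frame.

    All defaults are assumptions, not values taken from the project.
    """
    frame_len, step = round(win * sample_rate), round(hop * sample_rate)
    # ceil keeps the last partial frame, which is zero-padded below.
    n_frames = 1 + max(0, math.ceil((len(signal) - frame_len) / step))
    pad = (n_frames - 1) * step + frame_len - len(signal)
    padded = list(signal) + [0.0] * pad
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]  # Hamming window
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    frames = []
    for f in range(n_frames):
        chunk = padded[f * step:f * step + frame_len]
        windowed = [s * w for s, w in zip(chunk, window)]
        windowed += [0.0] * (n_fft - frame_len)  # pad frame to FFT size
        spectrum = fft(windowed)[:n_fft // 2 + 1]
        power = [abs(c) ** 2 / n_fft for c in spectrum]
        log_energy = [math.log(max(sum(p * w for p, w in zip(power, filt)), 1e-10))
                      for filt in bank]
        # Unnormalised DCT-II of the log mel energies; keep the first n_coeffs.
        frames.append([sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                           for m, e in enumerate(log_energy))
                       for c in range(n_coeffs)])
    return [list(row) for row in zip(*frames)]  # coefficients x frames


# Usage: one second of a synthetic 440 Hz tone stands in for a real clip.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
features = mfcc(tone)
print(len(features), len(features[0]))  # 13 99
```

Under these assumptions the frame count is 1 + ceil((16000 - 400) / 160) = 99, which matches the 13x99 shape quoted above. In practice a library such as librosa or python_speech_features would normally replace the hand-written steps, and its frame count may differ depending on its padding convention.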

Model Building

Training Analysis and Conclusion

The MFCCs and spectrograms of the audio signals can also be treated as an image dataset, on which CNN models can be built to classify the audio samples.

AdiNarendra98/CoughVid-Covid19-Detection-from-Cough-Samples

Folders and files

Latest commit

History

Repository files navigation

CoughVid- Covid19 Detection from Cough Voice Samples

Project Objective:

Challenge

Data Description

Dataset

Preprocessing

Model Building

Training Analysis and Conclusion

About

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Languages