NazarioR9/BNBR_Challenge

Basic Needs Basic Rights Kenya - Tech4MentalHealth

Brief Description

The objective of this challenge is to develop a machine learning model that classifies statements and questions expressed by university students in Kenya when speaking about the mental health challenges they struggle with. The four categories are depression, suicide, alcoholism, and drug abuse.
For more information about this challenge, see the competition page on Zindi.

Repo Structure

|----bnbr (package)
|       |--- . . .
|       |--- {module}.py
|       |--- . . .
|
|----data (placeholder for raw and preprocessed data)
|       |--- Train.csv
|       |--- Test.csv
|       |--- SampleSubmission.csv
|       |--- . . .
|
|----mask_language_modeling
|       |--- MLM_BertBase_finetuning.ipynb
|       |--- MLM_RobertaBase_finetuning.ipynb
|
|----notebooks
|       |--- BNBR_PreProcessing.ipynb
|       |--- TranslateWithTransformers.ipynb
|       |--- MLMRobertaBaseGenericModel.ipynb
|       |--- MLMBertBaseGenericModel.ipynb
|       |--- Blend_roberta_roberta_bert.ipynb
|
|----translated
|       |--- TRANSLATE_THE_DATA.ipynb
|       |--- second_roberta.ipynb
|
|---- Readme.md

PS: This is not the final structure. Running the code creates new directories, such as 'mlm_finetuned_models/' and 'submissions/'.

How to run the code

Steps

# 1. Make sure to follow the repo structure
# 2. Run 'notebooks/BNBR_PreProcessing.ipynb'
# 3. Run 'translated/TRANSLATE_THE_DATA.ipynb'
# 4. Run 'mask_language_modeling/MLM_BertBase_finetuning.ipynb' and 'mask_language_modeling/MLM_RobertaBase_finetuning.ipynb'
# 5. Run 'notebooks/MLMBertBaseGenericModel.ipynb', 'notebooks/MLMRobertaBaseGenericModel.ipynb', 'translated/second_roberta.ipynb'
# 6. Run 'notebooks/Blend_roberta_roberta_bert.ipynb'
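
The steps above can also be driven from a script. The following is a minimal sketch, not part of the repo: it assumes Jupyter and `nbconvert` are installed, that it is launched from the repo root, and that every notebook runs unattended. The names `NOTEBOOKS`, `build_command`, and `run_all` are hypothetical helpers introduced for illustration.

```python
import subprocess
from pathlib import Path

# Notebook paths in the order given by the steps above.
NOTEBOOKS = [
    "notebooks/BNBR_PreProcessing.ipynb",
    "translated/TRANSLATE_THE_DATA.ipynb",
    "mask_language_modeling/MLM_BertBase_finetuning.ipynb",
    "mask_language_modeling/MLM_RobertaBase_finetuning.ipynb",
    "notebooks/MLMBertBaseGenericModel.ipynb",
    "notebooks/MLMRobertaBaseGenericModel.ipynb",
    "translated/second_roberta.ipynb",
    "notebooks/Blend_roberta_roberta_bert.ipynb",
]


def build_command(notebook):
    """Return the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook", "--execute", "--inplace", notebook,
    ]


def run_all(repo_root="."):
    """Execute every notebook in order, stopping at the first failure."""
    root = Path(repo_root)
    for notebook in NOTEBOOKS:
        if not (root / notebook).is_file():
            raise FileNotFoundError(f"missing notebook: {notebook}")
        # check=True raises CalledProcessError if a notebook fails,
        # so later steps never run on incomplete outputs.
        subprocess.run(build_command(notebook), cwd=root, check=True)
```

Calling `run_all()` from the repo root runs the whole pipeline. Running the notebooks by hand in the same order is equivalent.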

Expectations

To make sure that everything is working smoothly, here is what to expect after each of the steps above:

# 1. (No output to check.)
# 2. After this step, verify that 'data/{final_train, final_test}.csv' exist
# 3. After this step, verify that 'data/{extended_train_from_fr_to_english, extended_test_from_fr_to_english}.csv' exist
# 4. Here, a new directory 'mlm_finetuned_models/' will appear (in the repo structure) and should contain '{mlm_bert_base_, mlm_roberta_base_}.zip'
# 5. Directory 'submissions/' will be added to the repo structure and '{bert-base-uncased__, roberta-base__, roberta-base_translated}.csv' will be written to it.
# 6. Performs a simple weight-blend, then creates 'submissions/final_submission.csv', which is the final submission file.
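
These checks can be automated. The sketch below is an illustration rather than code from the repo. It assumes the brace lists above expand shell-style into one file per name, and it keeps the file names exactly as written, including trailing underscores. `EXPECTED` and `missing_outputs` are hypothetical names, and key 6 stands for the final blending step.

```python
from pathlib import Path

# Expected outputs per step, copied from the list above.
# Assumption: '{a, b}.csv' means the two files 'a.csv' and 'b.csv'.
EXPECTED = {
    2: ["data/final_train.csv", "data/final_test.csv"],
    3: ["data/extended_train_from_fr_to_english.csv",
        "data/extended_test_from_fr_to_english.csv"],
    4: ["mlm_finetuned_models/mlm_bert_base_.zip",
        "mlm_finetuned_models/mlm_roberta_base_.zip"],
    5: ["submissions/bert-base-uncased__.csv",
        "submissions/roberta-base__.csv",
        "submissions/roberta-base_translated.csv"],
    6: ["submissions/final_submission.csv"],  # the final blending step
}


def missing_outputs(step, repo_root="."):
    """Return the expected files of `step` that do not exist under repo_root."""
    root = Path(repo_root)
    return [path for path in EXPECTED.get(step, []) if not (root / path).is_file()]
```

For example, `missing_outputs(2)` returns an empty list once the preprocessing notebook has written both files. A non-empty list names what is still missing.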

Look for the team named: OptimusPrime
Rank: 6/501
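
For readers unfamiliar with weight-blending, the final step amounts to a weighted average of the per-class predictions in the individual submission files. The sketch below shows the general technique only. It is not the notebook's code, and the actual weights are not stated in this README. It assumes every submission has an identifier column named `ID` (an assumption), the same label columns in the same order, and the same rows. It uses only the standard library so that it runs without extra dependencies.

```python
import csv


def read_predictions(path, id_column="ID"):
    """Load a submission CSV as (label names, {row id: [probabilities]})."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        labels = [name for name in reader.fieldnames if name != id_column]
        table = {
            row[id_column]: [float(row[label]) for label in labels]
            for row in reader
        }
    return labels, table


def blend(paths, weights, out_path, id_column="ID"):
    """Write the weighted average of several submission files to out_path."""
    if len(paths) != len(weights):
        raise ValueError("need exactly one weight per submission file")
    total = sum(weights)
    loaded = [read_predictions(path, id_column) for path in paths]
    labels, first = loaded[0]
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([id_column] + labels)
        for row_id in first:
            # Dividing by the weight sum keeps the result a weighted mean,
            # even when the weights do not add up to 1.
            blended = [
                sum(w * table[row_id][i] for w, (_, table) in zip(weights, loaded)) / total
                for i in range(len(labels))
            ]
            writer.writerow([row_id] + blended)
```

A call would look like `blend([file_a, file_b, file_c], [w_a, w_b, w_c], "submissions/final_submission.csv")`, with the weights chosen on validation data.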

Authors

| Name | Zindi ID | GitHub ID |
|------|----------|-----------|
| Muhamed TUO | @Muhamed_Tuo | @NazarioR9 |
| Darius MORURI | @Brainiac | @DariusTheGeek |
| Azer KSOURI | @plndz | @Az-Ks |
