teamUBUNTU/Doc-Assist-BOT

Doctor Assistant Bot

A disease prediction bot that provides assistance to doctors for better and faster diagnostics.

Augmenting clinical practice with data and AI. We try to empower the doctor with the ever-growing body of medical literature and statistics in time-bound situations. The system keeps learning after deployment.

What it does:

Doc Assist Bot is a chatbot that listens to a patient's symptoms and predicts the most probable infection or disease, thereby assisting the doctor in diagnosing the patient.
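As a rough illustration of the first step (turning a free-text symptom description into model input), here is a minimal sketch. It is not the project's actual pipeline: the project uses NLTK for pre-processing, while this sketch swaps in a plain regular-expression tokenizer so that it runs with the standard library only. The five-entry `SYMPTOMS` vocabulary and the function names are hypothetical.

```python
import re

# Hypothetical symptom vocabulary for illustration only.
SYMPTOMS = ["fever", "cough", "headache", "fatigue", "nausea"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def symptom_vector(description, vocabulary=SYMPTOMS):
    """Return a 0/1 list marking which vocabulary symptoms appear in the text."""
    tokens = set(tokenize(description))
    return [1 if symptom in tokens else 0 for symptom in vocabulary]


vector = symptom_vector("I have had a high fever and a dry cough since Monday.")
print(vector)  # [1, 1, 0, 0, 0]
```

Exact word matching is the simplest possible choice here; it misses multi-word symptoms and paraphrases, which is where stemming or sentence embeddings would come in.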

Framework:

  • Backend: Python + Flask
  • Frontend: Bootstrap (Barebones, for demo only)
  • Machine Learning Model: RandomForest and Naive Bayes from scikit-learn for the disease prediction. The model has been pretrained on a dataset of 4920 records covering 132 symptoms and 41 diseases. NLTK was used for pre-processing the input description.
  • Datasets: https://www.kaggle.com/rabisingh/symptom-checker and https://www.kaggle.com/sulianova/cardiovascular-disease-dataset
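The modelling step listed above can be sketched roughly as follows. This is a minimal example under stated assumptions, not the repository's training code: the data is synthetic (3 diseases and 9 binary symptoms instead of 41 and 132), `BernoulliNB` is an assumed choice of Naive Bayes variant because the features are binary, and all hyperparameters are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

rng = np.random.default_rng(0)

# Synthetic stand-in for the real dataset: 3 diseases, 9 binary symptoms.
# Disease i "owns" symptoms 3*i .. 3*i + 2.
n_diseases, n_symptoms, rows_per_disease = 3, 9, 200
profiles = np.zeros((n_diseases, n_symptoms), dtype=int)
for i in range(n_diseases):
    profiles[i, 3 * i : 3 * i + 3] = 1

# Repeat each profile, then flip about 5% of the bits to simulate noisy reports.
X = np.repeat(profiles, rows_per_disease, axis=0)
y = np.repeat(np.arange(n_diseases), rows_per_disease)
flip = rng.random(X.shape) < 0.05
X = np.where(flip, 1 - X, X)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Assumed model choices; the hyperparameters are illustrative only.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "naive_bayes": BernoulliNB(),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: held-out accuracy = {scores[name]:.3f}")

# Rank diseases by predicted probability for one new symptom vector.
patient = profiles[1].reshape(1, -1)
probabilities = models["random_forest"].predict_proba(patient)[0]
ranking = np.argsort(probabilities)[::-1]
print("most probable disease index:", ranking[0])
```

Returning a ranking rather than a single label fits the assistant role: the doctor sees the few most probable candidates and makes the final call.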

    How to run:

  • Have a look at our Colab notebook. Everything is clearly shown there: https://colab.research.google.com/drive/1iy0yRoLSxwoYgDCXyvx9CzolZY9YzbWq?usp=sharing

    Objective:

    An ever-increasing patient load and shrinking doctor-patient interaction time often result in misdiagnosis. Besides reducing wrong diagnoses, a continuously learning AI-based assistant can also help general practitioners and junior doctors derive insights from the latest clinical advancements, treatment protocols and medical literature in real time.

    The problem it solves:

    A doctor, whether experienced or fresh, is not always sufficiently equipped to use the latest information and developments in medical science in their practice. Moreover, in emergency situations there is substantial pressure, and the chances of missing a point of view are more pronounced. At the same time, there is a limit to the amount of information a doctor can remember and recall in real scenarios, and much experience and time are required to get good at this.

    Also, diseases and the human response to them are influenced by demographics, age group, etc. The time available to a doctor per patient is shrinking due to a high caseload. Often a clinician's vast experience may itself bias a diagnosis towards a more prevalent disorder or pathology.

    We address doctors rather than patients, in order to help the expert and to avoid encouraging self-medication. The approach is also scalable: as the information repositories keep growing, the system can use them directly to deliver that information to the doctor. The Covid-19 pandemic has made us realize that there is a lack of technology that lets doctors access information from literature and statistics quickly enough to integrate it into daily practice.

    INSTRUCTIONS TO RUN

  • install virtualenv and pip for Python 3.6, then create the environment using virtualenv -p /usr/bin/python3 venv
  • download https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/roberta-base-nli-stsb-mean-tokens.zip, unzip the contents into a 'model' folder and put that folder in the same path as app.py
  • download cardio.sav from https://drive.google.com/file/d/1Yx59_Tt58QAg3JSU8572jA0Qnh7dmpse/view?usp=sharing and put it in the same path as app.py
  • run source venv/bin/activate and install the following dependencies
  • pip3 install flask
  • pip3 install nltk
  • pip3 install pandas
  • pip3 install scikit-learn
  • pip3 install sentence_transformers
  • pip3 install pyrebase
  • run python3 app.py
  • to exit the virtual env, type deactivate
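Before running app.py, it can help to confirm that the downloaded artifacts from the steps above are where the instructions expect them. The following is a minimal sketch of such a check, not part of the repository: the function name `missing_artifacts` is hypothetical, and the only assumption about layout is the one stated in the steps (a 'model' folder and cardio.sav in the same path as app.py).

```python
from pathlib import Path

# Files and folders the run instructions expect next to app.py.
REQUIRED = {"app.py": "file", "cardio.sav": "file", "model": "dir"}


def missing_artifacts(project_dir):
    """Return the names from REQUIRED that are absent from project_dir."""
    root = Path(project_dir)
    missing = []
    for name, kind in REQUIRED.items():
        path = root / name
        present = path.is_dir() if kind == "dir" else path.is_file()
        if not present:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_artifacts(".")
    if missing:
        print("Missing before running app.py:", ", ".join(missing))
    else:
        print("All expected files are in place.")
```

Run it from the project folder with the virtual environment active; an empty result means the downloads landed in the right place.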

    Datasets to add

  • http://disnet.ctb.upm.es/
  • https://data.euro.who.int/cisid/
  • https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/DDB/stats.html
  • https://www.ctsi.umn.edu/researcher-resources/clinical-data-repository
  • https://disease-info-api.herokuapp.com/diseases
  • https://github.com/devcenter-square/disease-info
  • http://disnet.ctb.upm.es/visualization/diseases-by-symptom?symptoms=Fever+%7C+
  • https://github.com/deshanadesai/Symptom-X-/blob/master/dataset_clean1.csv
  • https://www.kaggle.com/plarmuseau/sdsort?select=sym_3.csv
  • https://www.kaggle.com/sulianova/cardiovascular-disease-dataset
  • https://www.kaggle.com/pitt/contagious-diseases
  • https://www.kaggle.com/uciml/pima-indians-diabetes-database
  • https://www.kaggle.com/ruslankl/early-biomarkers-of-parkinsons-disease
  • https://www.kaggle.com/flaredown/flaredown-autoimmune-symptom-tracker
  • https://www.malacards.org/
  • https://github.com/topics/biological-expression-language
  • https://github.com/jgpavez/MedicalDiagnosis