
miniproject: viral epidemics and zoonoses

Sana Saifi edited this page Sep 16, 2020 · 26 revisions

Which Viral Zoonoses lead to Viral Epidemic?


Owner

SANA SAIFI

Collaborator

_ 

Background

Zoonoses are diseases transmissible from animals to humans. Both new and old viral zoonoses play an important role in emerging and re-emerging viral diseases that can lead to an epidemic. Scientists estimate that more than 6 out of every 10 known infectious diseases in people can be spread from animals, and that 3 out of every 4 new or emerging infectious diseases in people come from animals.

Viruses from wildlife hosts have caused such emerging high-impact diseases as severe acute respiratory syndrome (SARS), Ebola fever, and influenza in humans.

Mini Project Summary

OBJECTIVE

This mini project sets out to find how, and which, zoonotic diseases lead to viral epidemics.

METHODOLOGY

  • Downloaded the communal corpus of 50 articles on viral epidemics using getpapers. 🟩FINISHED

  • Binary classification of the 50 articles into true positives and false positives, i.e. whether or not each article is about viral epidemics. 🟩FINISHED

  • Using ami search to find whether the articles mention any comorbidity in a viral epidemic, annotating with dictionaries to create ami DataTables. 🟩FINISHED

  • Sectioning the articles using ami section, which splits each document in a CTree into sections based on tags from JATS, etc. 🟩FINISHED

  • Re-run the query to get a corpus of 950 articles on _Viral Epidemics and Zoonoses_. 🟩FINISHED

  • Scrutinizing the 950 articles for true positives and false positives and creating a spreadsheet.🟨 STARTED

  • Using ami search to create DataTables and ami section for sectioning the 950 articles.🟩FINISHED

  • Create a dictionary, specifically related to the Mini Project.🟩FINISHED

  • Sectioning the papers on the basis of the diseases related to animals.🟪IN PROGRESS

  • Use relevant machine learning techniques to classify the papers by whether they are related to viral epidemics and by which viral zoonotic diseases they report. 🟨 STARTED

  • Displaying the results using R / KNIME. 🟥 NOT STARTED
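The true positive / false positive screening in the list above can be summarised as a single precision figure once the spreadsheet is exported. The following is only a minimal sketch: it assumes a CSV export with hypothetical column names `pmcid` and `label`, where `label` is `TP` or `FP`. The real spreadsheet may use different headers, and the sample rows are made up for illustration, not project results.

```python
import csv
import io


def screening_precision(csv_text):
    """Return (true_positives, false_positives, precision) from manual labels."""
    reader = csv.DictReader(io.StringIO(csv_text))
    tp = fp = 0
    for row in reader:
        # "label" is an assumed column name; rows with other values are ignored.
        label = row["label"].strip().upper()
        if label == "TP":
            tp += 1
        elif label == "FP":
            fp += 1
    total = tp + fp
    # Precision = relevant articles / all screened articles.
    precision = tp / total if total else 0.0
    return tp, fp, precision


# Made-up rows for illustration only; not results from the project.
sample = """pmcid,label
PMC0000001,TP
PMC0000002,FP
PMC0000003,TP
PMC0000004,TP
"""

print(screening_precision(sample))
```

A low precision would suggest that the getpapers query is too broad and should be tightened before the corpus is sectioned and searched.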

PROGRESS

◾ Spreadsheet of 50 articles classified into the subcategories of viruses, funders, countries, year of publication, testing and tracing, and type of paper. 🟩FINISHED

◾ Sectioning of the 950 papers using ami section 🟩FINISHED

◾ Downloaded a corpus of 950 articles on viral epidemics and zoonoses using getpapers🟩FINISHED

◾ Created a dictionary with 135 entries on zoonotic disease using ami dict.🟩FINISHED

◾ Created a Dictionary using Wikidata Query Service and SPARQL.🟩FINISHED

◾ Run ami search on corpus 950. 🟩FINISHED

◾ Release corpus 950 using Github desktop. 🟩FINISHED

◾ Installation of Anaconda for installing various tools, e.g. Jupyter. 🟩FINISHED

Corpora

◾ Initially the communal corpus of 50 articles on viral epidemics.

                getpapers -q "viral epidemics" -k 950 -o "viral epidemics" -x -p

◾ Next, a new corpus of 950 articles using the Dictionary Zoonoses.

◾ Downloaded the corpus of 950 articles using getpapers with the syntax:

                getpapers -q "Zoonoses in Viral epidemics" -k 950 -o "viral epidemics" -x -p

◾ This corpus was classified, searched and sectioned.

How to Commit to GitHub?

There are three methods to upload the corpus.

  1. Through Visual Studio Code.

See @Ambreen's Page for the instructions

  2. Through Command Prompt

Prerequisite: a local clone of the openVirus repository on your PC. If you do not have one, clone it with:

                                         git clone https://github.com/petermr/openVirus.git

Then run the following commands:

C:\Users\admin>cd openVirus

C:\Users\admin\openVirus> cd miniproject

C:\Users\admin\openVirus\miniproject> cd zoonoses

C:\Users\admin\openVirus\miniproject\zoonoses>git status

C:\Users\admin\openVirus\miniproject\zoonoses>dir

C:\Users\admin\openVirus\miniproject\zoonoses>git add .

C:\Users\admin\openVirus\miniproject\zoonoses>git status

C:\Users\admin\openVirus\miniproject\zoonoses>git commit -am "first commit all corpus"

C:\Users\admin\openVirus\miniproject\zoonoses>git pull

Next, push your commit; the files will start uploading:

C:\Users\admin\openVirus\miniproject\zoonoses>git push

Git will ask for your username and password. After you enter them, your files are pushed to the repository under your name.

  3. Through GitHub Desktop

Prerequisites: GitHub Desktop (installed) and a cloned openVirus repository.

  • Open the folder where you cloned the repository, and open your files in the CProject.
  • Copy the files and paste them into the folder of the openVirus repository (your local clone of the remote repository) where you want to commit them.
  • Open GitHub Desktop.
  • Go to 'File', then 'Add Local Repository'.
  • Choose the openVirus repository on your system.
  • Add a commit message and click 'Commit to master'.
  • After committing, click 'Push to origin'.
  • Once the push has completed, your uploaded files can be viewed in the GitHub repository.

Dictionary

  • How to create a dictionary?

(https://github.com/petermr/openVirus/wiki/Dictionary:-Zoonosis#how-i-created-)

  1. The test dictionary created using amidict was built manually and lacked synonyms, hosts, variable names, descriptions, Wikidata links, Wikipedia links, etc.

  2. The dictionary created using SPARQL had descriptions, links, some synonyms, labels and ids. However, the returned results were _scientific articles and journals_. The query needs refining, because we want the ids of zoonotic diseases and viruses.

As PMR suggested, this zoonotic disease dictionary has to be made manually.

Link to the manually made dictionary: https://github.com/petermr/openVirus/blob/master/dictionaries/zoonoses/zoonosis.xml
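A manually curated term list can be turned into a dictionary XML file with a few lines of Python. This is a hedged sketch, not the method used for zoonosis.xml: the element and attribute names (`dictionary`, `entry`, `title`, `term`, `name`, `description`) are assumptions modelled on the general shape of ami dictionaries and should be checked against the linked file. The two example entries and their descriptions were written for this illustration and are not copied from the project dictionary.

```python
import xml.etree.ElementTree as ET


def build_dictionary(title, entries):
    """Build a dictionary XML string from (name, description) pairs."""
    # Assumed layout: one <dictionary> root holding one <entry> per term.
    root = ET.Element("dictionary", {"title": title})
    for name, description in entries:
        ET.SubElement(root, "entry", {
            "term": name.lower(),        # lower-cased form used for matching
            "name": name,                # display form
            "description": description,
        })
    return ET.tostring(root, encoding="unicode")


# Illustrative entries only.
example_entries = [
    ("Rabies", "viral zoonosis transmitted mainly through bites of infected mammals"),
    ("Ebola virus disease", "viral haemorrhagic fever with animal reservoirs"),
]

xml_text = build_dictionary("zoonosis", example_entries)
print(xml_text)
```

Keeping the term list in code or a spreadsheet and generating the XML from it makes it easier to add synonyms and Wikidata ids later without hand-editing the file.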

Software Used:

  • Node.js (via nvm) for installing getpapers
  • getpapers for retrieving 950 articles from EuPMC
  • AMI for sectioning and searching.
  • SPARQL and amidict for creating dictionaries.
  • KNIME for displaying results.

AMI SECTIONING

Sectioning of the dataset is usually done for greater precision.

  1. Downloaded the corpus of 950 papers using getpapers as XML, PDF and JSON files.

                     getpapers -q "Zoonoses in Viral epidemics" -k 950 -o "viral epidemics" -x -p
    
  2. To ease the process, split the corpus into 5 subfolders of about 200 papers each.

  3. To divide the content of the papers into front, body, back and floats-group sections, open the Command Prompt again and run:

                          ami -p <name of directory> section
    
  4. This creates a sections subfolder inside the folder of each paper in your directory.
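The split in step 2 and the check in step 4 can be scripted instead of done by hand. The sketch below rests on two assumptions: each paper is one subdirectory of the getpapers output folder, and ami section writes its output into a subfolder named `sections` inside each paper folder, as described in step 4. The function names and the `batch_N` folder names are hypothetical.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def split_into_batches(cproject: Path, out_root: Path, batch_size: int = 200) -> list[Path]:
    """Copy each paper folder of `cproject` into numbered batch folders."""
    # Only directories are treated as papers; top-level result files are skipped.
    papers = sorted(p for p in cproject.iterdir() if p.is_dir())
    batches = []
    for start in range(0, len(papers), batch_size):
        batch_dir = out_root / f"batch_{start // batch_size + 1}"
        batch_dir.mkdir(parents=True, exist_ok=True)
        for paper in papers[start:start + batch_size]:
            shutil.copytree(paper, batch_dir / paper.name, dirs_exist_ok=True)
        batches.append(batch_dir)
    return batches


def papers_missing_sections(batch_dir: Path) -> list[str]:
    """Return the names of paper folders that have no `sections` subfolder yet."""
    return sorted(
        p.name for p in batch_dir.iterdir()
        if p.is_dir() and not (p / "sections").is_dir()
    )
```

With 950 papers and the default batch size, this gives five batch folders, the last one holding the remaining 150 papers. Running `papers_missing_sections` on a batch after `ami -p <name of directory> section` lists any papers that were not sectioned.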

AMI SEARCH

  1. Downloaded the corpus of 950 papers as XML, PDF and JSON files, using the same getpapers command as above.

  2. To search with the country, drugs, funders and diseases dictionaries, open the Command Prompt and run:

                    ami -p <name of directory> search --dictionary country drugs funders diseases 
    
  3. Open the directory; at the end of the folder listing you will find several HTML documents.

AMI VALIDATION

Open the Command Prompt and type:

cd ami3

git pull

mvn clean install -Dmaven.test.skip=true

Wait until the build reports BUILD SUCCESS.


NOT STARTED: KNIME, Keras, R

STARTED: dictionary

BLOCKED: .

FINISHED: downloading and installing getpapers, manual classification, list of zoonotic diseases, installing ami, getpapers, maven, jdk, sectioning of corpus950, ami search of corpus 950.

