
miniproject: viral epidemics and viruses

kareenasingh edited this page Aug 12, 2020 · 30 revisions

Which viruses are responsible for causing viral epidemics?

Owner:

Kareena Singh

Collaborator:

Jitu Ram Bhargav

Miniproject Summary

Objective

  • To answer the question “Which viruses are reported as being involved in causing viral epidemics?”
  • For context, not all viruses are infectious or able to cause an epidemic or a pandemic. Some viruses have been reported as being involved in a viral epidemic, whereas others have not. The goal is to find out which viruses can cause, or have caused, an epidemic outbreak.

Methodology

  • To create a dictionary of viruses from scratch (ami has no built-in viruses dictionary).
  • To download a corpus of 1000 articles on viruses that cause viral epidemics using getpapers.
  • To run ami search with the viruses dictionary.
  • To perform binary classification of the papers using KNIME.
  • To section the papers using ami section.
  • To identify and extract entities and display the data.
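
As a rough illustration of what the dictionary search step does conceptually, the sketch below counts how often each dictionary term appears in a piece of text. It is not how ami search is implemented; the term list `VIRUS_TERMS`, the function `count_terms` and the sample sentence are all hypothetical and exist only for this example.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary; the real one is built from Wikidata.
VIRUS_TERMS = ["Zika virus", "Ebola virus", "Dengue virus"]


def count_terms(text, terms):
    """Count case-insensitive whole-phrase matches of each term in text."""
    counts = Counter()
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        hits = len(pattern.findall(text))
        if hits:
            counts[term] = hits
    return counts


# Made-up sample sentence, not taken from the corpus.
sample = (
    "The Zika virus outbreak followed earlier Ebola virus epidemics. "
    "Zika virus spread quickly."
)
print(count_terms(sample, VIRUS_TERMS))
```

Terms with no match are simply left out of the result, so a paper that mentions none of the dictionary entries yields an empty counter.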

corpora 🟢 created

  • Initially, a communal corpus of 50 articles on viral epidemics, called epidemic50noCov, was created.
  • After analyzing the above corpus, each of us was to build an individual corpus of 950 papers, created using the virus dictionary.
  • The corpus of 950 articles was created and committed in 4 parts here: https://github.com/petermr/openVirus/tree/master/miniproject/virus

dictionaries

Software/ Tool set required

For Python see https://github.com/petermr/openVirus/wiki/Tools:-Python

Keras is a powerful and easy-to-use free open source Python library for developing and evaluating deep learning models. It wraps the efficient numerical computation libraries Theano and TensorFlow and allows you to define and train neural network models in just a few lines of code.

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.

  • R for summarizing the extracted information. R is a powerful language used widely for data analysis and statistical computing.

Displaying of data

  • To display the extracted information, we will use Excel to create spreadsheets and other forms of data display such as histograms, timelines, pie charts and other graphical representations. These displays form part of the scoping review.
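
A minimal sketch of how extracted results could be turned into a table that Excel can open and chart, assuming the search results are already available as a mapping from paper to the viruses found in it. The paper identifiers, the `paper_hits` data and both function names are placeholders invented for this example, not project output.

```python
import csv
import io
from collections import Counter

# Placeholder data: paper identifier -> viruses found in that paper.
paper_hits = {
    "PMC0000001": ["Zika virus", "Dengue virus"],
    "PMC0000002": ["Zika virus"],
    "PMC0000003": ["Ebola virus"],
}


def papers_per_virus(hits):
    """Count in how many papers each virus is mentioned at least once."""
    counts = Counter()
    for viruses in hits.values():
        counts.update(set(viruses))
    return counts


def to_csv(counts):
    """Render the counts as CSV text, most frequent virus first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["virus", "papers"])
    for virus, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([virus, n])
    return buffer.getvalue()


print(to_csv(papers_per_virus(paper_hits)))
```

Saving the returned text to a file with a .csv extension gives a two-column sheet that can be used directly as the source for a histogram or pie chart.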

Progress made:

  1. Analysed the 50 papers of the communal corpus epidemic50noCov and displayed the results in a spreadsheet. Finished 🟢
  2. Sectioned the 50 papers using ami section. Finished 🟢
  3. Downloaded a corpus of 950 articles on viruses using getpapers. Finished 🟢
  4. Sectioned corpus 950 using ami section. Finished 🟢
  5. Created a test dictionary (link above) with 30 entries on human viruses from a test file containing a list of names of human viruses, using ami dict. Finished 🟢
  6. Created a dictionary using the Wikidata Query Service and SPARQL. Finished 🟢
  7. Ran ami search on corpus 950 and received co-occurrence results. Finished 🟢
  8. Committed corpus 950 to GitHub. Finished 🟢
  9. Installed Jupyter Notebook as a machine learning tool. Finished 🟢
  10. Manual classification of corpus950. Ongoing 🔵
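
For the Wikidata step, the sketch below shows one possible way to turn a SPARQL query result into dictionary entries, assuming the result was saved in the standard SPARQL JSON results layout. The variable names `item` and `itemLabel`, the placeholder identifiers and labels, and the function `bindings_to_entries` are assumptions made for illustration; the query actually used for the project is not reproduced here.

```python
import json

# Placeholder response in the SPARQL JSON results layout.
# The identifiers and labels are invented, not real Wikidata items.
sample_response = json.loads("""
{
  "head": {"vars": ["item", "itemLabel"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q0000001"},
     "itemLabel": {"type": "literal", "value": "Example virus A"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q0000002"},
     "itemLabel": {"type": "literal", "value": "Example virus B"}}
  ]}
}
""")


def bindings_to_entries(response):
    """Extract (term, Wikidata ID) pairs from a SPARQL JSON result."""
    entries = []
    for row in response["results"]["bindings"]:
        # The ID is the last path segment of the entity URI.
        wikidata_id = row["item"]["value"].rsplit("/", 1)[-1]
        entries.append({"term": row["itemLabel"]["value"], "wikidata_id": wikidata_id})
    return entries


for entry in bindings_to_entries(sample_response):
    print(entry["wikidata_id"], entry["term"])
```

Keeping the Wikidata ID next to each term makes it possible to trace a dictionary entry back to its source item later.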

How to commit your corpus950 to GitHub?

  • Download and install GitHub Desktop from https://desktop.github.com, log in to your GitHub account and clone the openVirus repository using its URL.
  • Note the folder on your system where you cloned the repo. Open the folder of your miniproject and move your corpus950 files there. In GitHub Desktop, the changed files are then listed on the left.
  • Add a summary such as 'added files' to miniproject and commit to master.
  • Then click on Push changes and your data will be uploaded (this will take time depending on your file size). The committed corpus appears here: https://github.com/petermr/openVirus/tree/master/miniproject/virus

Constraints



INITIAL SUMMARY BY INYAS COLLABORATOR

Why am I working on this project?

  • I am an MSc student and want to pursue a PhD in a related field.
  • It helps in understanding the current scenario in viral epidemics.
  • It gives an idea of how to access data stored online and how to use that information for our research.
  • This project will help me understand research methodology using computational biology and bioinformatics.

What is this miniproject about? Purpose and Goals

  • To create and maintain a dictionary of viruses that are responsible for causing viral epidemics.
  • To find papers and articles related to viruses and viral epidemics.
  • To identify the different types of viruses that are causing viral epidemics around the globe.
  • To collect up-to-date data on viruses and viral epidemics from trusted sources.

What are the different tools required in this miniproject?

  • getpapers to obtain papers
  • ami to create and maintain the dictionary
  • ami search to test the dictionary
  • ami section for document sectioning
  • amidict, a tool for creating a dictionary

Overview of usage and significance of getpapers?

  • It gives an overview of the published work on the specific topic an individual is working on.
  • It covers the work done to date and indicates the possibilities for further research on the topic, within or beyond its current limits.

Overview of usage and significance of ami?

  • A platform for editing and analysing processed documents and papers; it is also used for uploading data and creating dictionaries.

What is the meaning and purpose of a dictionary? How is it created?

  • My dictionary, virus, was created from Wikidata using the ami software.

What is in my corpus 950? How was it created?

  • It consists of 950 articles taken from Europe PubMed Central (EuPMC) with the help of getpapers.
  • EuPMC is a collection of journal literature and research articles related to the life sciences from around the globe.
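
A minimal sketch of how a getpapers download could be scripted, assuming getpapers is installed and on the PATH. The `-q` (query), `-o` (output directory), `-x` (full-text XML) and `-k` (limit) options are standard getpapers flags, but the query string shown is a hypothetical example, not the query actually used to build corpus 950, and `build_getpapers_command` is a helper invented for this sketch.

```python
import shlex


def build_getpapers_command(query, outdir, limit):
    """Assemble the argument list for a getpapers full-text XML download."""
    return ["getpapers", "-q", query, "-o", outdir, "-x", "-k", str(limit)]


# Hypothetical query and output folder, for illustration only.
cmd = build_getpapers_command("viral epidemics", "corpus950", 950)
print(shlex.join(cmd))

# To actually run the download (needs getpapers and network access):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when the query contains spaces.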

Bugs or issues

  • No bugs or issues have been encountered so far. I hope that any problems that arise can be solved by communicating with the allocated mentor and members of the openVirus group before the completion of my four-week programme.

What have you learnt and understood in this project?

  • I learnt about the purpose and usage of getpapers, ami, corpus 950 etc.
  • I understood how to update and edit pages on GitHub.
  • I learnt where I can collect the articles from.
  • I also understood how to download and install software from GitHub.

