petermr edited this page Jun 18, 2020 · 1 revision

miniprojects

Each intern has their own project. The current ones are:

  • countries (Ambreen) "what countries are viral epidemics reported in?" Possible outcomes are a spreadsheet or a world map.
  • diseases (Priya) "what diseases co-occur with epidemics?" Co-occurrence does not necessarily mean causation.
  • viruses (Kareena) "what are the main viruses causing epidemics?"
  • drugs (Rajan) "what drugs are used during epidemics?" Some may be antiviral, palliative, or antibacterial against secondary infections.
  • funders (Vaishali) "which funders support research on viral epidemics?"

All projects have an element of machine classification ("learning") and natural language processing (NLP). The main uses are:

  • is this paper really/mainly about viral epidemics?
  • does your concept (above) co-occur in the same sentence as the virus/disease, i.e. is it tightly coupled? For example, does "India" refer to a virus in India, or is it unrelated (e.g. the reagent came from an Indian supplier)?
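
The sentence-level co-occurrence check above can be sketched in a few lines of Python. This is only an illustration, not the ami implementation: the function names, the naive regex sentence splitter, and the example term lists are all assumptions, and a real project would probably use a proper NLP sentence tokenizer and a dictionary of terms.

```python
import re

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def contains_term(sentence, terms):
    """True if any term occurs in the sentence as a whole word, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", sentence, flags=re.IGNORECASE)
        for term in terms
    )

def cooccurring_sentences(text, concept_terms, anchor_terms):
    """Return the sentences that mention both a concept term and an anchor term."""
    return [
        sentence
        for sentence in split_sentences(text)
        if contains_term(sentence, concept_terms) and contains_term(sentence, anchor_terms)
    ]

# Hypothetical example text and term lists, for illustration only.
text = (
    "An outbreak of Zika virus was reported in India in 2018. "
    "The reagent came from a supplier in India. "
    "The virus spread quickly."
)
hits = cooccurring_sentences(text, ["India"], ["virus", "epidemic", "outbreak"])
for sentence in hits:
    print(sentence)
```

Only the first sentence is kept: the second mentions "India" without a virus term, and the third mentions a virus without the country. Whole-word matching also means "Indian" does not count as a hit for "India".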

The main packages will be:

  • ami for sectioning in CProjects and dictionary searching.
  • KNIME for workflow and analytical tools
  • R for workflow and analytical tools
  • Keras for machine learning
  • Jupyter for logging and reusable scripts

You will use whatever you are most comfortable with. We are not forcing one-size-fits-all. However, there will be a need for converters. If you are ingesting from the CProject into (say) R or Keras or KNIME, let us know now as I may need to write exporters. Structured formats such as XML or JSON are valuable, as the consuming tool can often use XPath or similar to ingest the bits it wants. @clyde davies
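
As a rough sketch of what XPath-style ingestion might look like from Python, the example below pulls the paragraphs of one section out of an XML document and re-exports them as JSON. The XML layout (`article`, `sec`, `title`, `p`) is a made-up stand-in, not the actual CProject output, and the function name is hypothetical. The standard-library `xml.etree.ElementTree` supports only a subset of XPath; a fuller XPath engine would need a library such as lxml.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sectioned document; real CProject files will differ.
SAMPLE_XML = """
<article>
  <sec id="s1"><title>Introduction</title><p>Zika virus outbreaks are increasing.</p></sec>
  <sec id="s2"><title>Methods</title><p>Samples were collected in 2019.</p><p>RNA was extracted.</p></sec>
</article>
"""

def paragraphs_in_section(xml_text, section_title):
    """Return the text of every <p> directly inside the <sec> with the given <title>."""
    root = ET.fromstring(xml_text.strip())
    paragraphs = []
    for sec in root.findall(".//sec"):          # XPath subset: all <sec> descendants
        if sec.findtext("title") == section_title:
            for p in sec.findall("p"):          # direct <p> children only
                paragraphs.append("".join(p.itertext()).strip())
    return paragraphs

methods = paragraphs_in_section(SAMPLE_XML, "Methods")
# Re-export as JSON so that R, KNIME or Keras pipelines can read the same data.
print(json.dumps({"section": "Methods", "paragraphs": methods}, indent=2))
```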

WIKIDATA: I have struggled with lookup because there isn't a simple API (I may be out of date?). It used to work, but I think the interface has changed. I got blocked by a mixture of lazy loading and mojibake.
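
One possible route for lookup, offered as a sketch to be verified rather than a confirmed fix, is the `wbsearchentities` action of the MediaWiki API on wikidata.org, which returns JSON. The helper names, the User-Agent string, and the sample response below are assumptions for illustration. Decoding the response bytes explicitly as UTF-8 is one way to reduce the risk of mojibake.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def build_search_url(term, language="en", limit=5):
    """Build a wbsearchentities request URL for a search term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)

def parse_search_response(raw_bytes):
    """Decode the bytes as UTF-8 and return (id, label) pairs from the 'search' list."""
    data = json.loads(raw_bytes.decode("utf-8"))
    return [(hit["id"], hit.get("label", "")) for hit in data.get("search", [])]

def search_wikidata(term, language="en", limit=5):
    """Query Wikidata over the network (not exercised in the offline example below)."""
    request = Request(
        build_search_url(term, language, limit),
        headers={"User-Agent": "miniproject-lookup-sketch/0.1"},  # hypothetical agent string
    )
    with urlopen(request, timeout=30) as response:
        return parse_search_response(response.read())

# Offline example with a made-up response; "Q0" is a placeholder, not a real item.
sample = json.dumps({"search": [{"id": "Q0", "label": "São Paulo"}]}).encode("utf-8")
print(build_search_url("Zika virus"))
print(parse_search_response(sample))
```

Calling `search_wikidata("Zika virus")` would make a live request, so results depend on the service at the time.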

USE THE WIKI! We should use the Wiki (or Github pages) for almost all project/software support. Email is really awful and Slack is not appropriate: it is difficult to find anything more than a few days old and there's no context. (I shall copy this Slack to the Wiki).

  • every project should have a wiki page.
  • every piece of software should have a wiki page.
  • every technique should have a wiki page.