Tasks, People and Roles

This is a volunteer project, where most people have "day activities" that take precedence. It is understood that contributors give what they can, when they can, and this project and the world are very grateful.

Timescales are unpredictable. We are aiming to duplicate our experience with volunteer-based projects like https://en.wikipedia.org/wiki/Blue_Obelisk.

Vision

To create new Open scientific/medical knowledge relevant to the COVID-19 pandemic that will be used by others for:

policy
other science (especially data-driven)
education and learning for everyone

With limited, but smart, resources our main impact will be:

using knowledge from a wide range of disciplines, not just biomedical: social science, psychology, economics, literature, etc., as well as maths, physics, chemistry, materials and biomedicine
welcoming collaborators from round the world
giving readers power to define the knowledge they want (e.g. creating dictionaries)
helping citizens demand that all scholarly research be free to everyone

How we work

We have common goals:

to download and index the whole of the world's relevant literature
to build crawlers and readers that make this trivial for users
to transform legacy documents (PDF) into structured XHTML
to build a distributed dictionary system that covers all relevant subjects
to search and extract snippets of text (and other knowledge) that can be indexed and aggregated
to enhance the reading of the literature through annotation
to build reusable resources for search, education, machines
to interface with other communities (Wikimedia, R, Jupyter, etc.)
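The dictionary search and snippet extraction goal can be illustrated with a minimal sketch. This is not the project's AMI code: the function name `find_snippets`, the `window` parameter and the sample sentence are hypothetical, and a "dictionary" is assumed here to be just a list of terms.

```python
import re

def find_snippets(text, terms, window=30):
    """Return (term, snippet) pairs for each dictionary term found in text.

    Hypothetical helper: `terms` stands in for a dictionary, and `window`
    is the number of context characters kept on each side of a match.
    """
    hits = []
    for term in terms:
        # Whole-word, case-insensitive match; assumes terms begin and end
        # with word characters so that \b behaves as expected.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            hits.append((term, text[start:end]))
    return hits

# Made-up sentence, for illustration only.
sample = "Early reports suggested that masks reduce transmission in hospitals."
for term, snippet in find_snippets(sample, ["masks", "transmission"], window=15):
    print(term, "->", snippet)
```

Each hit keeps the term alongside its surrounding text, so snippets from many documents could later be indexed and aggregated by term.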

We have managed to identify important areas which are self-contained and continuous (i.e. every contribution matters and enhances the project, but none is blocking). This is mainly because we have multiple inputs (e.g. corpora, sites), multiple dictionaries (knowledge facets), multiple outputs, and multiple distribution routes. There are also general tasks (documentation, tutorials, outreach) which are continuous. There is no blame, and we do not rely too heavily on other colleagues' contributions.

Current Tasks and People

This section is highly mutable!

Workflow

The workflow is basically:

inputs: crawl and read sources, and normalize them
transform: search them with dictionaries, possibly including transforms
outputs: display the results

These are largely separable, as information flows downwards and is captured in the filestore at each stage. That means that a developer "only" needs to be able to read from one standard file type and write to another.
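The file-based separation of stages can be sketched as follows. This is only an illustration under stated assumptions: the function names, the plain-text and JSON file types, and the directory layout are all hypothetical and are not the project's actual formats.

```python
import json
from pathlib import Path

def normalize(raw_dir: Path, norm_dir: Path) -> None:
    """Input stage: read raw text files, write whitespace-normalized copies."""
    norm_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(raw_dir.glob("*.txt")):
        text = " ".join(src.read_text(encoding="utf-8").split())
        (norm_dir / src.name).write_text(text, encoding="utf-8")

def search(norm_dir: Path, terms: list[str], results_file: Path) -> None:
    """Transform stage: count dictionary terms per document, write JSON."""
    results = {}
    for doc in sorted(norm_dir.glob("*.txt")):
        lowered = doc.read_text(encoding="utf-8").lower()
        # Simple substring counts; a real search would be more careful.
        results[doc.name] = {t: lowered.count(t.lower()) for t in terms}
    results_file.write_text(json.dumps(results, indent=2), encoding="utf-8")

def display(results_file: Path) -> str:
    """Output stage: read the JSON results and render them as plain text."""
    results = json.loads(results_file.read_text(encoding="utf-8"))
    lines = []
    for name, counts in results.items():
        cells = ", ".join(f"{term}={n}" for term, n in counts.items())
        lines.append(f"{name}: {cells}")
    return "\n".join(lines)
```

Each function touches only the files written by the stage before it, so under these assumptions any one stage could be rewritten (or replaced by a tool in another language) without changing the others.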

Infrastructure: RP, PMR

Inputs

EuropePMC, our main workhorse (works): no one
bioRxiv and medRxiv (prototype): PMR
theses: AJ
journals / scrapers: LH
DOAJ abstracts: CD

Transforms and searches

AMI: RP, PMR
Solr: CD, AJ

Outputs

R: TS
display: CD & TS

Dictionaries

creation, maintenance, documentation: RL, PMR
Wikimedia: TS

Testing

Everyone :)
