WikiCite Conference 2020

petermr edited this page Oct 22, 2020 · 13 revisions

https://meta.wikimedia.org/wiki/WikiCite/2020_Virtual_conference

Invitation

We are invited to speak at the WikiCite Virtual Conference on October 28th. Due to the pandemic, this year the conference will be held completely online, and the format is a good opportunity to share ongoing projects around WikiCite with the global community.

Célio and Érica are organizing a session at the conference focused on data modelling, bibliography and library data, in which ongoing projects will be shared. Our team is one of the invitees to the session.

When?

  • The talk will take place on October 28th, 17:30 IST (PLEASE CHECK!! == 12:00 UTC == 12:00 GMT in the UK, where BST ended on October 25th == 09:00 Brazil)
  • 60 minutes (50 min. talk + 10 min. discussion)
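
A minimal sketch for checking the time conversions listed above, using Python's standard `zoneinfo` module. The zone names are assumptions: `Asia/Kolkata` stands for IST, `Europe/London` for UK time, and `America/Sao_Paulo` for "Brazil" (Brasília time). Brazil spans several time zones, so other regions may differ.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def convert_talk_time(zones):
    """Convert the talk start (17:30 IST on 2020-10-28) into each named zone."""
    # Assumption: "IST" on this page means India Standard Time.
    start = datetime(2020, 10, 28, 17, 30, tzinfo=ZoneInfo("Asia/Kolkata"))
    return {name: start.astimezone(ZoneInfo(name)) for name in zones}


times = convert_talk_time(["UTC", "Europe/London", "America/Sao_Paulo"])
for zone, local in times.items():
    # %Z prints the zone abbreviation in force on that date (e.g. GMT or BST).
    print(zone, local.strftime("%Y-%m-%d %H:%M %Z"))
```

The abbreviation in the output shows whether daylight saving time applies on the day of the talk, which is the detail most likely to go wrong when converting by hand.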

Who is our Audience?

English is not the primary language of most of the audience.

@PMR, please update this section. We imagine that most are primarily Spanish (es) or (PMR: Don't worry. We don't know where in the program this goes. I think your/Ambreen's presentations will be very well understood. For those on poorer connections it may be harder to pick out words, so good, simple slides matter.)

Division of the Talk:

The talk will be divided into the following parts:

  • Introduction by PMR (This will be extended)

  • A two-minute presentation for each mini-project (the importance of your facet, how you created dictionaries, interesting results)

  • Brief demo of creating dictionaries, downloading papers, ami search (all in one Jupyter Notebook). Takes roughly two minutes

  • Division of the talk

THIS HAS CHANGED

(This is just a crude outline, and of course, needs more refinement. Everybody, please add your ideas.)

How to make it more interactive

  • Live demonstrations are much better than slides.
  • => A captioned video could be a safer alternative, since it avoids interruptions to a live demonstration (such as network issues).

Experiences

Everybody, please record all your experiences here:

  1. The importance of your facet
  2. How you created your dictionary
  3. Your dashboard and cooccurrence
  4. Multilinguality
  5. Links
  6. YOUR ideas

Everybody, please feel free to add more thoughts and ideas to this page.

Dictionary : Disease

Importance of dictionary : (need to add)

Creation of dictionary : https://github.com/petermr/openVirus/tree/master/dictionaries/diseases/disease_dict.md

Results (dashboard and cooccurrence) : (need to add)

Multilinguality :

  • Using SPARQL (Wikidata Query Service), multilinguality became easy to achieve. The valid disease dictionary so far covers 4 Indic languages (Spanish and Portuguese will be added soon).
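
As an illustration of how labels in several languages can come from one query, here is a minimal sketch. It is not the query used for the disease dictionary: the function names, the language codes `hi` and `ta`, and the choice of "instance of disease" (`wdt:P31 wd:Q12136`) as the selection pattern are all assumptions. The sketch only builds the query text and groups a response in the standard SPARQL JSON results format, so it runs without network access.

```python
def build_label_query(class_qid, languages, limit=50):
    """Build a SPARQL query for labels of items that are instances of class_qid."""
    lang_list = ", ".join(f'"{code}"' for code in languages)
    return (
        "SELECT ?item ?label WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} ;\n"
        "        rdfs:label ?label .\n"
        f"  FILTER(LANG(?label) IN ({lang_list}))\n"
        "}\n"
        f"LIMIT {limit}"
    )


def group_labels(sparql_json):
    """Group SPARQL JSON bindings into {QID: {language: label}}."""
    labels = {}
    for row in sparql_json["results"]["bindings"]:
        qid = row["item"]["value"].rsplit("/", 1)[-1]  # keep only the QID
        lang = row["label"]["xml:lang"]
        labels.setdefault(qid, {})[lang] = row["label"]["value"]
    return labels


# Hypothetical language choice; placeholder labels stand in for real ones.
query = build_label_query("Q12136", ["hi", "ta"])
sample_response = {"results": {"bindings": [
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q12136"},
     "label": {"type": "literal", "xml:lang": "hi", "value": "label-in-hi"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q12136"},
     "label": {"type": "literal", "xml:lang": "ta", "value": "label-in-ta"}},
]}}
grouped = group_labels(sample_response)
print(query)
print(grouped)
```

Adding a language then only means extending the list passed to `build_label_query`, which is why the Wikidata route scales to new languages with little effort.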

Links : The dictionaries, with notes on their creation and gradual improvement, are on the wiki page

Clean data for demo

IMPORTANT Please can all project owners update their:

  1. raw minicorpus (+ DOCUMENTATION - where did the corpus come from?)
  2. "cleaned" dictionary (as far as possible) + DOCUMENTATION. Validate with the XSD schema using an online service.
  3. multilingual dictionary (will merge with 1 when valid)
  4. searched corpus. This may be the same articles as 3 but will have the results etc.
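
Before sending a dictionary to an XSD validator, a quick local pre-check can catch the most common problems. The sketch below is not XSD validation: it only tests that the XML is well-formed and that every entry carries a term. The element and attribute names (`dictionary`, `entry`, `term`, `wikidataID`) are assumptions about the dictionary format, and the sample content is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary snippet; the second entry deliberately lacks a term.
SAMPLE = """<dictionary title="disease">
  <entry term="disease" name="disease" wikidataID="Q12136"/>
  <entry name="entry without a term"/>
</dictionary>"""


def check_dictionary(xml_text):
    """Return a list of problems found; an empty list means the pre-check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != "dictionary":
        problems.append(f"unexpected root element: {root.tag}")
    for number, entry in enumerate(root.iter("entry"), start=1):
        if not entry.get("term"):
            problems.append(f"entry {number} has no term attribute")
    return problems


problems = check_dictionary(SAMPLE)
print(problems)
```

A dictionary that passes this pre-check still needs validating against the actual schema, since the schema may constrain attributes this sketch does not look at.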

Do we have a manager for demos?

New outline of session

PMR: this adds some Wikimedia stuff to this...

Background: ContentMine was funded by Wikimedia to develop two prototypes:

wikifactmine (2017)

https://www.wikidata.org/wiki/Wikidata:WikiFactMine "to add referenced scientific facts to Wikidata. It carries out searches of the scientific literature, using search terms divided up into sections called dictionaries." Among the outputs were a Wikidata SPARQL query system to create dictionaries, and the extraction and annotation of bibliography for scientific publications.

science source

https://meta.wikimedia.org/wiki/Grants:Project/ContentMine/ScienceSource "Improve biomedical content within Wikimedia, by building an algorithmic version of the medical references guideline." This developed approaches for filtering and classifying articles and emphasized the role of "main subject".

OpenVirus

history

  • Open Climate Knowledge => Open Virus (NB what was approximate start date? Priya, Kareena?)
  • NIPGR interns
  • INYAS interns (Dheeraj still active member!)
  • CambioHackathon
  • recent link up with Redalyc and Scholia

vision

  • to create Wikidata-enhanced multilingual dictionaries for any facet of science relevant to viral epidemics
  • to retrieve, normalise and section a broad range of OA articles on viral epidemics
  • to annotate these with selected dictionaries and extract useful patterns and insights

october 2020

  • 8 facets, each with minicorpus and dictionary
  • Jupyter notebooks to drive these projects.
  • exploration of ES-language articles with Redalyc.

REVIEW OF SELECTED FACETS AND RESULTS
