MasterThesis

Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in different languages. Exploiting this, we introduce a new multilingual dataset (X-WikiRE), framing relation extraction as a multilingual machine reading problem. We show that by leveraging this resource it is possible to robustly transfer models cross-lingually and that multilingual support significantly improves (zero-shot) relation extraction, enabling the population of low-resourced KBs from their well-populated counterparts.
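As a rough illustration of what "relation extraction as machine reading" can look like, the sketch below turns a query about an entity and a relation into a natural-language question that is answered from a passage, with `None` standing for "the passage does not express this relation". This is not the thesis code: the template table, the relation name `educated_at`, the `ReadingExample` type, and the `make_example` helper are all hypothetical, and the example sentence is purely illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadingExample:
    question: str
    context: str
    answer: Optional[str]  # None when the context does not contain the answer


# Hypothetical question templates, keyed by (relation, language).
# Having the same relation in several languages is what makes the setup multilingual.
TEMPLATES = {
    ("educated_at", "en"): "Where did {entity} study?",
    ("educated_at", "es"): "¿Dónde estudió {entity}?",
}


def make_example(entity: str, relation: str, lang: str,
                 context: str, answer: Optional[str]) -> ReadingExample:
    """Build one reading-comprehension instance from a KB-style fact."""
    question = TEMPLATES[(relation, lang)].format(entity=entity)
    # Assumption: an answer counts only if it appears verbatim in the context;
    # otherwise the instance is treated as unanswerable.
    if answer is not None and answer not in context:
        answer = None
    return ReadingExample(question=question, context=context, answer=answer)


example = make_example(
    entity="Niels Bohr",
    relation="educated_at",
    lang="en",
    context="Niels Bohr studied at the University of Copenhagen.",
    answer="University of Copenhagen",
)
print(example.question)
print(example.answer)
```

Under this framing, a relation never seen during training only needs a new question template, which is one way to read the zero-shot setting mentioned above.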

The full thesis is available in the MasterThesis.pdf file.

See the X-WikiRE repository for the code used to create the dataset.

This work was done while visiting the CoAStaL Lab at the University of Copenhagen.

Cite

@inproceedings{abdou-etal-2019-x,
    title = "X-{W}iki{RE}: A Large, Multilingual Resource for Relation Extraction as Machine Comprehension",
    author = "Abdou, Mostafa  and
      Sas, Cezar  and
      Aralikatte, Rahul  and
      Augenstein, Isabelle  and
      S{\o}gaard, Anders",
    booktitle = "Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-6130",
    doi = "10.18653/v1/D19-6130",
    pages = "265--274",
    abstract = "Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in different languages. Exploiting this, we introduce a new multilingual dataset (X-WikiRE), framing relation extraction as a multilingual machine reading problem. We show that by leveraging this resource it is possible to robustly transfer models cross-lingually and that multilingual support significantly improves (zero-shot) relation extraction, enabling the population of low-resourced KBs from their well-populated counterparts.",
}