@dsfsi

Data Science for Social Impact Research Group @ University of Pretoria

We are the Data Science for Social Impact research group at the Computer Science Department, University of Pretoria.

Our work spans two general areas: Data Science for Society and Local Language Natural Language Processing. The two strands are complementary. Our work in Data Science for Society has given us a more nuanced understanding of the systemic challenges that stand in the way of doing excellent science with local languages. It requires us to consider how users are situated within the process whenever Data Science research is carried out. We find that we need to adjust our research to address these challenges and to innovate in how we gather direct or alternative data.

For us, Data Science for Society means improving the approaches, methods and scientific tools of Data Science while enhancing the ways decision-makers can use the insights that come from these tools. Local Language Natural Language Processing focuses on developing new tools, new data and new methodology to improve the state of African languages.

DSFSI Vision, Mission and Values.

Vision

To be a leading inclusive lab that creates and harnesses data and multidisciplinary scientific exploration for societal impact.

Mission

Data-driven collaborative innovation to empower society to tackle challenges and preserve our languages.

Values

  • Community and Collaboration
  • Shared responsibility
  • Inclusiveness
  • Integrity and openness
  • Agency
  • Generosity

Pinned

  1. PuoBERTa Public

    A RoBERTa-based language model specially designed for Setswana, using the new PuoData dataset.

    Makefile 4

  2. vukuzenzele-nlp Public

    Forked from dsfsi/dsfsi-dataset-template

    The dataset contains editions of the South African government magazine Vuk'uzenzele. Data was scraped from PDFs that have been placed in the data/raw folder. The PDFs were obtained from the Vuk'u…

    Jupyter Notebook 6 4

  3. textaugment Public

    TextAugment: Text Augmentation Library

    Python 372 61

  4. covid19za Public

    Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa

    Jupyter Notebook 255 200

  5. gov-za-multilingual Public

    The data set contains cabinet statements from the South African government. Data was scraped from the government's website: https://www.gov.za/cabinet-statements

    Jupyter Notebook 3

  6. masakhane-web Public

    Masakhane Web is a translation web application solely for African languages.

    Jupyter Notebook 35 15

Repositories

Showing 10 of 43 repositories