
GSoC Outreachy 2022 Ideas

Antonin Delpeuch edited this page Feb 3, 2023 · 4 revisions

Here is a list of projects which were proposed for OpenRefine's participation in Outreachy in 2022.

Implement a SPARQL importer

  • Difficulty: easy
  • Description: SPARQL is a query language that can be used to get tabular data out of a triple store. It would be great if users could directly create a project from a SPARQL query to a given SPARQL endpoint without downloading the query results themselves, similarly to the existing SQL integration. There could be some interplay with reconciliation (the importer could create reconciled values directly).
  • Expected outcomes: a new importer would be added, either in the core software or as an extension
  • Skills required/preferred: this will require both backend (Java) and frontend (HTML/CSS/JS) work. Familiarity with RDF and SPARQL would also help.
  • Possible mentors: @wetneb
  • Relevant issues: #1212 and some Documentation links discussion
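
As a rough illustration of the tabular step described above, the sketch below flattens a response in the standard SPARQL 1.1 Query Results JSON Format into column names and rows. It is written in Python for brevity and is not OpenRefine code: the function name `sparql_json_to_rows` and the sample payload are assumptions made for this example, and the HTTP request to the endpoint is left out.

```python
import json


def sparql_json_to_rows(payload):
    """Flatten a SPARQL JSON result (already parsed into a dict) into a table.

    Hypothetical helper for illustration only. Unbound variables become None.
    """
    columns = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        # Each binding maps a variable name to {"type": ..., "value": ...};
        # a variable that is unbound in this solution is simply absent.
        rows.append([binding.get(var, {}).get("value") for var in columns])
    return columns, rows


# Made-up sample response, shaped like the SPARQL 1.1 JSON results format.
SAMPLE = json.loads("""
{
  "head": {"vars": ["item", "label"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://example.org/a"},
     "label": {"type": "literal", "value": "Alpha"}},
    {"item": {"type": "uri", "value": "http://example.org/b"}}
  ]}
}
""")

columns, rows = sparql_json_to_rows(SAMPLE)
```

Keeping the `type` field of each binding alongside its value (rather than dropping it, as this sketch does) is what would let an importer treat URIs as candidates for reconciled cells.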

Server-side localization

  • Difficulty: easy
  • Description: We are currently only able to translate messages generated in our web frontend. Some messages shown to the user are generated in the backend, and those are always in English, without the possibility to translate them. This project would introduce a mechanism to localize the backend too.
  • Expected outcomes: a new mechanism would be introduced to easily generate translatable messages from the backend. The translation files would be exposed to Weblate, so that our existing community of translators can also translate the backend.
  • Skills required/preferred: this will primarily require work on the backend (Java) but some light frontend work might be required too. Familiarity with localization best practices could help.
  • Possible mentors: @wetneb
  • Relevant issues: #2443
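
One possible shape for such a mechanism, sketched in Python rather than Java to keep it short: messages are looked up by key in a per-language catalog, with a fallback to English and then to the key itself. The catalog contents, the key `import.row_count` and the function `translate` are all hypothetical and do not reflect any existing OpenRefine API.

```python
# Hypothetical in-memory catalogs; a real setup would load these from the
# translation files that are exposed to the translators.
CATALOGS = {
    "en": {"import.row_count": "Imported {count} rows"},
    "fr": {"import.row_count": "{count} lignes importées"},
}


def translate(key, lang, **params):
    """Return the message for `key` in `lang`, falling back to English.

    If no catalog knows the key, the key itself is returned so that the
    missing translation stays visible instead of raising an error.
    """
    for candidate in (lang, "en"):
        template = CATALOGS.get(candidate, {}).get(key)
        if template is not None:
            return template.format(**params)
    return key


french = translate("import.row_count", "fr", count=3)
fallback = translate("import.row_count", "de", count=3)
```

Using named placeholders such as `{count}` lets translators reorder the parts of a sentence, which positional string concatenation in the backend would not allow.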

User-defined clustering

  • Difficulty: medium
  • Description: Our binning clusterers let the user choose between various methods for assigning values to bins. Extensions can define new binning methods, but writing an extension is still quite some work. It would be even better if users could simply provide an expression (GREL, Jython, Clojure…) which computes the bin into which a given value falls. That would let users better adapt the binning strategy to their own use cases.
  • Expected outcomes: A new clustering method which accepts a user-defined expression
  • Skills required/preferred: both the backend (Java) and frontend (HTML/CSS/JS) will need adapting
  • Possible mentors: @wetneb
  • Relevant issues: #4301
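
To make the idea concrete, here is a minimal Python sketch of binning with a user-supplied key function, which stands in for the user-defined expression. The names `cluster_by_key` and `normalise` are invented for this example and the code is not taken from OpenRefine's clustering implementation.

```python
from collections import defaultdict


def cluster_by_key(values, key_function):
    """Group values by the bin computed by `key_function`.

    Only bins holding at least two distinct values are returned, since a
    bin with a single distinct value offers nothing to merge.
    """
    bins = defaultdict(list)
    for value in values:
        bins[key_function(value)].append(value)
    return [members for members in bins.values() if len(set(members)) > 1]


def normalise(value):
    # Example user-defined bin: lowercase the value and sort its words.
    return " ".join(sorted(value.lower().split()))


clusters = cluster_by_key(["New York", "new york", "York New", "Boston"], normalise)
```

Swapping `normalise` for another function, for instance one that keeps only the first three characters, changes the binning strategy without touching the grouping code, which is the flexibility this project aims to give users.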

Template

  • Difficulty:
  • Description:
  • Expected outcomes:
  • Skills required/preferred:
  • Possible mentors:
  • Relevant issues: