Add support for non-English language Wikipedias #11

Open
pachevalier opened this issue Feb 27, 2018 · 5 comments

Comments

@pachevalier

Really nice tool. Are you planning to launch a version for frwiki and other Wikipedias?

@jwngr
Owner

jwngr commented Mar 3, 2018

No plans currently, although it shouldn't be that hard in theory. I renamed this issue and will leave it open as a future enhancement.

jwngr changed the title from "frwiki version" to "Add support for non-English language Wikipedias" on Mar 3, 2018
jwngr mentioned this issue on Mar 3, 2018
@Djiit

Djiit commented Jun 20, 2018

Hi there,

Thanks for your great work here!

I'm willing to help on this issue; you know, scratching my own itch.
Could you point me in the right direction? Where should I look first?

Cheers

@jwngr
Owner

jwngr commented Jun 24, 2018

Hi @Djiit - Thanks! It's awesome that you're willing to help out with this. I think there is a non-trivial amount of work to make this happen, mainly because all the code and the database assume everything is in a single language. It will be somewhat difficult to go from 1 to 2 languages, but should be easy to go from 2 to N. Here are all the things that come to mind:

Database

  1. Update the database creation script to accept a new command line argument which specifies the language of the Wikipedia dump to use. This argument will need to be used in place of the string enwiki which is currently hardcoded in the script.
  2. Update the database creation script to store the downloaded and parsed files in dump/<language>/ instead of just dump/ since several languages will be downloaded at once.
  3. Update the database creation script to create multiple sdow.sqlite files, one for each language (e.g., sdow-en.sqlite, sdow-fr.sqlite).
  4. Add a new language column to the searches SQLite table.
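
As a rough illustration of the four database items above, here is a minimal Python sketch. It is not the project's actual script: the `--language` flag, the helper names (`parse_args`, `wiki_name`, `dump_dir`, `database_filename`, `add_language_column`), and the assumption that existing rows default to `en` are all hypothetical. Only the `enwiki` string, the `dump/<language>/` layout, the `sdow-<language>.sqlite` naming, and the `searches` table come from the list.

```python
import argparse
import sqlite3
from pathlib import Path


def parse_args(argv=None):
    """Parse a hypothetical --language flag (item 1)."""
    parser = argparse.ArgumentParser(
        description="Build a per-language database (illustrative sketch)."
    )
    parser.add_argument(
        "--language",
        default="en",
        help="Wikipedia language code, for example 'en' or 'fr'",
    )
    return parser.parse_args(argv)


def wiki_name(language):
    """Replace the hardcoded 'enwiki' string: 'fr' becomes 'frwiki' (item 1)."""
    return f"{language}wiki"


def dump_dir(language, root="dump"):
    """Directory for downloaded and parsed files: dump/<language>/ (item 2)."""
    return Path(root) / language


def database_filename(language):
    """One SQLite file per language, e.g. sdow-fr.sqlite (item 3)."""
    return f"sdow-{language}.sqlite"


def add_language_column(conn):
    """Add a language column to the searches table if it is missing (item 4).

    Assumption: rows recorded before the migration were English searches,
    so the column defaults to 'en'.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(searches)")]
    if "language" not in columns:
        conn.execute(
            "ALTER TABLE searches ADD COLUMN language TEXT NOT NULL DEFAULT 'en'"
        )
        conn.commit()
```

Calling `parse_args(["--language", "fr"])` and passing the result through the helpers gives the `frwiki` dump name, the `dump/fr` directory, and the `sdow-fr.sqlite` file name. `add_language_column` checks for the column first, so running it twice is harmless.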

Server

  1. Update the /paths endpoint to handle a new language property in the request body.
  2. Update the Database class to handle multiple languages and update the way it is initialized in the main server.py script.
  3. Specify the new language column when writing to the searches table.
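
The server items above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `MultiLanguageDatabases`, `language_from_body`, `record_search`, the `SUPPORTED_LANGUAGES` tuple, and the `source_id` / `target_id` columns are hypothetical and do not reflect the real `Database` class or `searches` schema. It is also framework-agnostic, so the request body is modelled as a plain dict.

```python
import sqlite3

# Hypothetical list of languages that have a built database.
SUPPORTED_LANGUAGES = ("en", "fr")


class MultiLanguageDatabases:
    """Hypothetical wrapper that holds one SQLite connection per language."""

    def __init__(self, paths_by_language):
        # paths_by_language maps a language code to a SQLite file path,
        # for example {"en": "sdow-en.sqlite", "fr": "sdow-fr.sqlite"}.
        self._connections = {
            language: sqlite3.connect(path)
            for language, path in paths_by_language.items()
        }

    def connection_for(self, language):
        """Return the connection for a language, or raise ValueError."""
        try:
            return self._connections[language]
        except KeyError:
            raise ValueError(f"Unsupported language: {language!r}") from None


def language_from_body(body, default="en"):
    """Read the language property from a /paths request body (a dict).

    Assumption: requests that omit the property fall back to English.
    """
    language = body.get("language", default)
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    return language


def record_search(conn, source_id, target_id, language):
    """Write a search row that includes the new language column."""
    conn.execute(
        "INSERT INTO searches (source_id, target_id, language) VALUES (?, ?, ?)",
        (source_id, target_id, language),
    )
    conn.commit()
```

Validating the language before looking up a connection means an unknown code can be turned into a client error by the endpoint instead of surfacing as a missing-file problem later.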

Website

  1. Add a language selection setting (maybe as a dropdown) to the website's Home component to choose which language to use for the search.
  2. Add a language field to the POST body of the call to the server backend.

Documentation

  1. Update the Data Source documentation, especially the "Database Creation Script" and "Database Creation Process" instructions.
  2. Update the Web Server Setup documentation, especially around how to manage the multiple sdow.sqlite files.

So yeah, there are quite a few things to update, but none of them are terribly complex and I'm here to answer questions for you. Even if you only want to do a portion of this, I'd appreciate it. See Contributing for instructions on how to set up your local environment. It should be up to date, but if anything is wrong, feel free to ask me about it or send me a PR.

@DyeffersonAz

My moral support here! 😋

@DyeffersonAz

It could even be made to work with ANY MediaWiki site.

(Just a dream XD)


4 participants