GSOC 2018 Guide

Aneesh Joshi edited this page Feb 14, 2018 · 7 revisions

Intro

Gensim is participating in GSoC 2018 under the NumFOCUS umbrella. General useful information for students is available here.

How to choose a project

The main part of GSoC is 3 months of coding. This is a long time, so it’s critical that you feel comfortable with your project and are completely dedicated to it.

First of all, have a look at our GSoC 2018 ideas page and the Gensim 2018 Roadmap to understand our OSS plans. If you don’t find a project that fits exactly, don't worry: feel free to suggest any project related to NLP and text processing that you feel would delight other users. That said, projects that are mindful of our roadmap and broad goals are preferred.

Skills you'll need

Base skills:

  • Git
  • Python
  • GitHub ("How to submit a PR", "How to merge upstream from the repository", etc.)

Additional skills (depending on the specific project):

  • Cython, or C or C++ with Python bindings (for implementing new models from scratch or optimization projects)
  • PyTorch/Keras (for NN projects)

Getting Started Early

Experience shows that the best way to help your application is to contact the project you want to work with early.

You can do several things:

Proposal tips (in addition to the great NumFOCUS intro)

  1. Have a look at good proposals from the previous years: Parul, Prakhar
  2. Write your own proposal, mentioning:
    • The project name.
    • Your personal information (name, GitHub account, email).
    • A detailed description, with motivating examples of how the community will benefit from your project. Pay attention to the "whys", not only "hows". Why should this task be done at all?
    • Implementation plan.
    • Timeline.
  3. Send an email to student-projects@rare-technologies.com with the subject "GSoC 2018 <NAME_SURNAME>", including:
    • Your proposal (PDF)
    • Your CV
    • A letter covering your personal motivation to participate in GSoC with Gensim, your experience and background.
  4. Submit your proposal to the NumFOCUS repo and to Google.