Google Summer of Code Application Guide

Marcus Edel edited this page Feb 6, 2024 · 8 revisions

This page has some details on how you should prepare your proposal. It is only a guide so please do not feel constrained to follow it to the letter!

Things to think about before applying

Before you consider applying to work on mlpack for Google Summer of Code 2024, please consider the following points:

Students who are accepted into the Google Summer of Code program are expected to work full-time hours for the 12 weeks of the program -- that's 40 hours a week. It is not a good idea to try to do GSoC and also work another job; there simply isn't enough time. It's okay if your schedule doesn't allow 40-hour weeks every week (for instance, if you are going on vacation), but you'll have to let us know so that we can work around it.

mlpack is a complex library implementing difficult machine learning algorithms and sometimes using confusing C++ language features. Any GSoC project for mlpack will almost certainly involve becoming closely familiar with the internal workings of some of these algorithms. Although mentors exist to help out with this process, it's unreasonable to expect them to be able to explain every part of every paper you might need to read. So some level of self-motivation and definitely a willingness to learn are prerequisites.

Participation in the community (IRC/chat or the mailing list) is highly encouraged and helps us get a better picture of who you are, your abilities, and how you will be able to contribute to the library over the summer. In addition, the community is there to help you out! Feel free to ask questions, but do keep in mind that mlpack is a fairly small organization so there may not always be someone around to help.

Basic information

Once you've considered those three points, and want to write an application, make sure that your application has the following basic information. If you've already made yourself known to the community, make sure any aliases you've used there are posted here, so that we can know this application corresponds to someone we have already met.

  • Name
  • University / school
  • Field of study
  • Date study was started
  • Expected graduation date
  • Homepage
  • Email
  • IRC nick (if applicable)
  • Interests and hobbies

Also, be sure to answer the following questions about your technical proficiency and coding skills:

  • What languages do you know? Rate your experience level (1-5: rookie-guru) for each.
  • How long have you been coding in those languages?
  • Are you a contributor to other open-source projects?
  • Do you have a link to any of your work (e.g., a GitHub profile)?
  • What areas of machine learning are you familiar with?
  • Have you taken any coursework relevant to machine learning?

Optional questions

Here are some questions intended for us to get an idea of who you are. You aren't required to answer them, but it is helpful for us. There is no right answer; these are open-ended questions.

  • What are your long-term plans, if you have figured those out yet? Where do you hope to see yourself in 10 years?
  • Describe the most interesting application of machine learning you can think of, and then describe how you might implement it.
  • Both algorithm implementation and API design are important parts of mlpack. Which is more difficult? Which is more important? Why?

The project proposal

The most important part of your application is the project proposal. After including the information above, you should write a project proposal that clearly outlines the following things:

  • The objectives of the proposed project. This should detail what you expect to have accomplished at the end of the summer.
  • Some background information on the project. This may include descriptions of the algorithms / data structures / ideas you plan to implement. Especially for more complex ideas, be sure to give enough background that a person who is reasonably familiar with machine learning and mlpack will be able to understand your description without needing to consult other references (but, when relevant, do link to other references).
  • If applicable, describe the API of your finished project and how it will fit in with the rest of the mlpack code. We realize that it's impossible to know exactly how the API will work when the project is not yet started, but you should at least have a basic idea of how you might want it to work.
  • Describe how you will test your project. Keep in mind that with machine learning algorithms, testing is often the most difficult part of the process, and it can be very hard to write good tests -- especially for statistical algorithms.
  • Describe a timeline of some sort for your project, broken down week by week (or thereabouts). In reality we understand that things don't often go according to plan, but having a predefined plan can significantly help the development process.
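
As an illustration of the testing point above, here is a minimal sketch of one common way to test a randomized or statistical routine: fix the random seed so the test is reproducible, and compare the result to the known answer within a tolerance instead of checking for exact equality. It is written in Python for brevity (mlpack's own tests are written in C++), and every name in it (`estimate_mean`, `run_trial`, the seed, and the tolerance) is a hypothetical placeholder, not part of mlpack.

```python
import random


def estimate_mean(samples):
    """Hypothetical stand-in for the algorithm under test."""
    return sum(samples) / len(samples)


def run_trial(seed, n=10_000, true_mean=3.0, true_std=2.0):
    """Draw n Gaussian samples with a fixed seed and estimate their mean."""
    rng = random.Random(seed)  # private generator, so the trial is reproducible
    samples = [rng.gauss(true_mean, true_std) for _ in range(n)]
    return estimate_mean(samples)


def test_estimate_is_close(seed=42, true_mean=3.0, tolerance=0.1):
    """Check the estimate against the known mean, within a tolerance.

    The standard error here is 2 / sqrt(10000) = 0.02, so a tolerance of
    0.1 is five standard errors: loose enough that a correct estimator
    passes, tight enough to catch a badly wrong one.
    """
    estimate = run_trial(seed)
    assert abs(estimate - true_mean) < tolerance


if __name__ == "__main__":
    test_estimate_is_close()
    print("test passed")
```

In a proposal, it is usually enough to describe a strategy like this in words: what the known answer is, how the tolerance is chosen, and how randomness is controlled.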

Other information and tips for success

Please feel free to include any other information that you think is relevant but that none of the above questions cover.

Being accepted into Google Summer of Code is a competitive process and the proposal process may be time-consuming. Prospective students who have already contributed code to the library and are participating in the community are more likely to be selected, since contributions are a good way to demonstrate your talent and ability.

One good place to start might be the "get involved" page on the mlpack website:

https://www.mlpack.org/community.html

We look forward to hearing from you!