
Outstanding issues not specific to any tips #252

Open
SiminaB opened this issue Oct 7, 2020 · 10 comments

SiminaB commented Oct 7, 2020

This is to discuss any issues that we may think are not currently adequately covered. If they relate to specific tips, use #242 #243 #244 #245 #246 #247 #248 #249 #250 #251


SiminaB commented Oct 7, 2020

In re-reading this, I thought of two issues that we may want to cover. At minimum, I think many people reading this paper will expect them to be covered. They could be included in the Intro or Conclusion, or as part of existing tips:

  1. How does one go about fitting these models, and is special software always required? We can at least give some good references for how to do this and note the main packages and computational requirements. I know this isn't a "getting started with DL" paper, but we can still spend 2-3 sentences on it.
  2. Can DL be inadvertently used to perpetuate existing stereotypes, e.g., racist and sexist ones? We know this can happen either because of the training set (e.g., the training set consists exclusively of individuals of European descent, then the model is used on a more diverse population) or because the predictions are incorrectly interpreted due to confounding (e.g., the training set has doctors and nurses, most doctors are men and most nurses are women, so going forward gender is explicitly or implicitly given an outsized role in predicting career choice). The paper focuses on biology, so perhaps one good example would be the performance of face recognition approaches on individuals of European vs. non-European descent.

Benjamin-Lee (Owner) commented:

Some thoughts in response:

  1. We should mention these, as well as auto-ML tools like TPOT.
  2. DL fairness should probably be mentioned in the interpretation or privacy tips. Which place do you think is better?


SiminaB commented Oct 8, 2020

We could change Tip 10 to be about ethics I guess? That way both fairness and privacy would fit.

Benjamin-Lee (Owner) commented:

@SiminaB I just addressed your first point in the PR for #241. Specifically, I mentioned TF and PyTorch as well as Keras, AutoKeras, Turi Create, and TPOT. If there are any other tools you think are worth mentioning, do let me know.


SiminaB commented Oct 11, 2020

Looks good! One question as someone who doesn't use DL in research: can you actually run meaningful DL models on a laptop? The implication is that it would be hard to do so, e.g., in:

In contrast, traditional ML training can often be done on a laptop (or even a $5 computer [@arXiv:1809.00238]) in seconds to minutes.

Benjamin-Lee (Owner) commented:

It's doable in some cases but not really ideal. In my experience, I've always ended up having to use a cloud machine for training all but the simplest models. I've never done transfer learning so I can't comment on whether that brings things down to consumer-grade laptop level. @rasbt probably knows more than I do about that.
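To make the laptop question concrete, here is a minimal sketch (not from the thread, and in plain NumPy rather than a DL framework): a tiny two-layer network on synthetic data trains in well under a second on any laptop. What pushes real training onto GPUs or cloud machines is the scale of the model and data, not anything qualitatively different in the computation.

```python
import numpy as np

# Tiny two-layer network on a synthetic task: predict whether the
# product of two inputs is positive. Trains in well under a second.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = ((X[:, 0] * X[:, 1]) > 0).astype(float).reshape(-1, 1)

# Parameters: 2 inputs -> 8 hidden units -> 1 output.
W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.5

for _ in range(2000):
    # Forward pass: tanh hidden layer, sigmoid output.
    h = np.tanh(X @ W1 + b1)
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))
    # Backward pass: gradients of binary cross-entropy loss.
    grad_out = (p - y) / len(X)
    gW2 = h.T @ grad_out; gb2 = grad_out.sum(0)
    grad_h = (grad_out @ W2.T) * (1 - h ** 2)
    gW1 = X.T @ grad_h; gb1 = grad_h.sum(0)
    # Full-batch gradient descent update.
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

acc = ((p > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

The same loop with millions of parameters and examples is what makes frameworks, GPUs, and cloud machines necessary in practice.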


SiminaB commented Oct 11, 2020

I think it would be helpful to clarify this as it would help inform someone whether they can actually do DL. If it is appropriate to their problem but not really doable on their device, of course they can look into using the cloud or initiating a collaboration.

Benjamin-Lee (Owner) commented:

Definitely a good idea to speak affirmatively to what DL needs.


agitter commented Jan 26, 2021

I'm copying my comment from #313 (comment) here so we don't lose track of it.

  • There is a lot of existing guidance about best practices for machine learning and deep learning that we do not reference
  • The examples we provide in the intro and elsewhere are fairly arbitrary, and not necessarily representative nor the most impressive applications
  • Some tips still have no biology examples
  • Second person is not used consistently (Second person or third person? #237)
  • Some tips (e.g. 4) aren't very specific to deep learning
  • There is some redundancy across tips

These are all minor enough to address after the initial submission.

Benjamin-Lee (Owner) commented:

Thank you for adding it here and glad to see nothing else is blocking. I'll work on #237 once we do the content freeze since that is cosmetic.
