[joss] software paper comments #21

Open
hbaniecki opened this issue Dec 5, 2021 · 1 comment
Labels
documentation Improvements or additions to documentation

Comments

@hbaniecki
Contributor

hbaniecki commented Dec 5, 2021

openjournals/joss-reviews#3934 Hi, I hope these comments help improve the paper.

Comments

  1. The paper's title could be revised. It currently reads "PySS3: A new interpretable and simple machine learning model for text classification", but the model is named "SS3" and does not appear to be new. The repository's title seems more accurate, "A Python package implementing a new simple and interpretable model for text classification", yet even there one could drop "new" and use the PyPI package's title, e.g. "PySS3: A Python package implementing the SS3 interpretable text classifier [with interactive/visualization tools for explainable AI]". Just an example to be considered.
  2. I would recommend that the authors highlight in the article the software's "interactive" aspects (explanation, analysis) and (model, machine learning) "monitoring", as these seem both novel and emerging in recent discussions.
  3. Finally, it would be useful to release a stable version 1.0 of the package (on GitHub, PyPI) and note that in the paper, e.g. in the Summary section.

Summary

  • L10. "implements novel machine learning model" - it might not be seen as novel, given that the model was already published in 2019 and extended in 2020.
  • L11. Mentioning "two useful tools" without describing what the second one does seems off.

Statement of need
This part mainly discusses the need for an open-source implementation of the machine learning model. However, as I see it, the significant contributions of the software/paper, distinguishing it from previous work, are the Live_Test/Evaluation tools, which allow for visual explanation and hyperparameter optimization. This could be underlined further.
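For context, here is a minimal sketch of how these two tools are typically used together; it is based on my reading of the package's documentation, and the dataset paths and hyperparameter grids below are purely illustrative:

```python
from pyss3 import SS3
from pyss3.util import Dataset, Evaluation
from pyss3.server import Live_Test

# Train an SS3 classifier on a folder-per-category dataset
# (the "datasets/movie_review" path is a placeholder)
clf = SS3()
x_train, y_train = Dataset.load_from_files("datasets/movie_review/train")
clf.fit(x_train, y_train)

# Visual explanation: launch the interactive Live Test tool in the browser
x_test, y_test = Dataset.load_from_files("datasets/movie_review/test")
Live_Test.run(clf, x_test, y_test)

# Hyperparameter optimization: grid-search the s, l, p hyperparameters
# and open the interactive evaluation plot (grid values are made up)
Evaluation.grid_search(clf, x_test, y_test,
                       s=[0.3, 0.45, 0.6],
                       l=[0.5, 1.0, 1.5],
                       p=[0.5, 1.0, 2.0])
Evaluation.plot()
```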

State of the field
The paper lacks a brief discussion of packages in the field of interpretable and explainable machine learning. To that end, I suggest the authors reference/compare with the following software related to interactive explainability:

  1. Wexler et al. "The What-If Tool: Interactive Probing of Machine Learning Models" (IEEE TVCG, 2019) https://doi.org/10.1109/TVCG.2019.2934619
  2. Tenney et al. "The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models" (EMNLP, 2020) https://doi.org/10.18653/v1/2020.emnlp-demos.15
  3. Hoover et al. "exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models" (ACL, 2020) https://doi.org/10.18653/v1/2020.acl-demos.22
  4. [Ours] "dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python" (JMLR, 2021) https://www.jmlr.org/papers/v22/20-1473.html

Other possibly missing/useful references:

  1. Pedregosa et al. "Scikit-learn: Machine Learning in Python" (JMLR, 2011) https://www.jmlr.org/papers/v12/pedregosa11a.html
  2. Christoph Molnar "Interpretable Machine Learning - A Guide for Making Black Box Models Explainable" (book, 2018) https://christophm.github.io/interpretable-ml-book
  3. Cynthia Rudin "Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead" (Nature Machine Intelligence, 2019) https://doi.org/10.1038/s42256-019-0048-x
  4. [Ours] "modelStudio: Interactive Studio with Explanations for ML Predictive Models" (JOSS, 2019) https://doi.org/10.21105/joss.01798

Implementation

  • L48 github -> GitHub
  • L54 "such as the one introduced later by the same authors" -> "by us" would be easier to read
  • L57 missing the citation of scikit-learn

Illustrative examples

  1. At the beginning, it lacks a brief description of the predictive task used in the example (dataset name, positive/negative text classification, etc.).
  2. Also, the example could now be updated to use the Dataset.load_from_url() function (see the sketch below).
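As a rough illustration of point 2, the example's data-loading step could look something like the following; the URL is a placeholder, and the exact signature of Dataset.load_from_url() should be checked against the package's documentation:

```python
from pyss3.util import Dataset

# Download and load the example dataset directly from a URL
# (the URL below is a placeholder, not the actual dataset location)
x_train, y_train = Dataset.load_from_url("https://example.com/movie_review.zip")
```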

Conclusions
Again, I have doubts that the machine learning model is "novel", as it has been previously published, etc. The current phrasing might be misunderstood as "introducing a novel machine learning model".

@sergioburdisso sergioburdisso added the documentation Improvements or additions to documentation label Dec 6, 2021
sergioburdisso added a commit that referenced this issue Apr 4, 2022
@sergioburdisso
Owner

Hi @hbaniecki!

It's been forever, I'm so sorry! I've moved to Switzerland and started working as a postdoctoral researcher here, and it was a huge change in my life.

I've addressed most of the points you highlighted (commit 61c8419), btw THANKS for your valuable advice. Below I'll address each one of your points:

Comments

  1. The title has been updated following your guidance; it is now "PySS3: A Python package implementing SS3, a simple and interpretable machine learning model for text classification". The word "novel" has been removed from the paper, as you suggested.
  2. This point will be addressed as part of the "Statement of need", following the other points you suggested there.
  3. I don't think the API is stable enough yet for a 1.0 version, but I will release a new version with the new changes (including loading datasets from a URL) and reference that version in the paper. Do you think that is OK? Of course, if you don't agree, we can talk about it, no problem! :)

Summary

  • Both points were fixed.

Statement of need
Yes, I totally agree. In the initial version I didn't include it due to space limitations (in fact, the paper exceeded the 1000-word limit). I'll address this point and the ones you raised in the following item together in this section. The idea is to talk a little bit about interpretability and explainability, cite the papers you suggested, and then add the "gap phrase", like "However, little attention has been paid...", and focus on the need for interpretable models (not just explainable, but interpretable, i.e. self-explainable). What do you think?

State of the field
These references and this discussion will be added above.

Implementation

  • All three points have been fixed.

Illustrative examples

  1. I've added a brief description of the dataset at the beginning.
  2. Updated the example using the Dataset.load_from_url() function.

Conclusions
I've changed the conclusions, removing "novel" and adding an extra sentence.

I'm still working on the changes regarding the "Statement of need"; I'll let you know as soon as I finish. Again, thank you so much for your review work, and apologies for the delay...
