[joss] software paper comments #21

Open
hbaniecki opened this issue Dec 5, 2021 · 1 comment
Labels
documentation Improvements or additions to documentation

Comments

@hbaniecki
Contributor

hbaniecki commented Dec 5, 2021

openjournals/joss-reviews#3934 Hi, I hope these comments help improve the paper.

Comments

  1. The paper's title could be revised. It currently reads "PySS3: A new interpretable and simple machine learning model for text classification", but the model is named "SS3" and does not appear to be new. The repository's title seems more accurate, "A Python package implementing a new simple and interpretable model for text classification", yet even there one could drop "new" and use the PyPI package's title, e.g. "PySS3: A Python package implementing the SS3 interpretable text classifier [with interactive/visualization tools for explainable AI]". Just an example to be considered.
  2. I would recommend that the authors highlight in the article the software's "interactive" aspects (explanation, analysis) and (model, machine learning) "monitoring", as these seem both novel and emerging in recent discussions.
  3. Finally, it would be useful to release a stable version 1.0 of the package (on GitHub, PyPI) and note that in the paper, e.g. in the Summary section.

Summary

  • L10. "implements novel machine learning model" - it might not be seen as novel, given that the model was already published in 2019 and extended in 2020.
  • L11. Mentioning "two useful tools" without describing what the second one does seems off.

Statement of need
This part mainly discusses the need for an open-source implementation of the machine learning model. However, as I see it, the significant contributions of the software/paper, distinguishing it from previous work, are the Live_Test/Evaluation tools, which allow for visual explanation and hyperparameter optimization. This could be underlined further.
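For context, here is a minimal sketch of how these two tools are typically used together; it is based on my reading of the package's documentation, and the dataset paths and hyperparameter grids below are purely illustrative:

```python
from pyss3 import SS3
from pyss3.util import Dataset, Evaluation
from pyss3.server import Live_Test

# Train an SS3 classifier on a folder-per-category dataset
# (the "datasets/movie_review" path is a placeholder)
clf = SS3()
x_train, y_train = Dataset.load_from_files("datasets/movie_review/train")
clf.fit(x_train, y_train)

# Visual explanation: launch the interactive Live Test tool in the browser
x_test, y_test = Dataset.load_from_files("datasets/movie_review/test")
Live_Test.run(clf, x_test, y_test)

# Hyperparameter optimization: grid-search the s, l, p hyperparameters
# and open the interactive evaluation plot (grid values are made up)
Evaluation.grid_search(clf, x_test, y_test,
                       s=[0.3, 0.45, 0.6],
                       l=[0.5, 1.0, 1.5],
                       p=[0.5, 1.0, 2.0])
Evaluation.plot()
```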

State of the field
The paper lacks a brief discussion of packages in the field of interpretable and explainable machine learning. To that end, I suggest the authors reference/compare with the following software related to interactive explainability:

  1. Wexler et al. "The What-If Tool: Interactive Probing of Machine Learning Models" (IEEE TVCG, 2019) https://doi.org/10.1109/TVCG.2019.2934619
  2. Tenney et al. "The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models" (EMNLP, 2020) https://doi.org/10.18653/v1/2020.emnlp-demos.15
  3. Hoover et al. "exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformer Models" (ACL, 2020) https://doi.org/10.18653/v1/2020.acl-demos.22
  4. [Ours] "dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python" (JMLR, 2021) https://www.jmlr.org/papers/v22/20-1473.html

Other possibly missing/useful references:

  1. Pedregosa et al. "Scikit-learn: Machine Learning in Python" (JMLR, 2011) https://www.jmlr.org/papers/v12/pedregosa11a.html
  2. Christoph Molnar "Interpretable Machine Learning - A Guide for Making Black Box Models Explainable" (book, 2018) https://christophm.github.io/interpretable-ml-book
  3. Cynthia Rudin "Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead" (Nature Machine Intelligence, 2019) https://doi.org/10.1038/s42256-019-0048-x
  4. [Ours] "modelStudio: Interactive Studio with Explanations for ML Predictive Models" (JOSS, 2019) https://doi.org/10.21105/joss.01798

Implementation

  • L48 github -> GitHub
  • L54 "such as the one introduced later by the same authors" -> "by us" would be easier to read
  • L57 missing the citation of scikit-learn

Illustrative examples

  1. At the beginning, it lacks a brief description of the predictive task used in the example (dataset name, positive/negative text classification, etc.).
  2. Also, the example could now be updated to use the Dataset.load_from_url() function (see the sketch below).
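As a rough illustration of point 2, the example's data-loading step could look something like the following; the URL is a placeholder, and the exact signature of Dataset.load_from_url() should be checked against the package's documentation:

```python
from pyss3.util import Dataset

# Download and load the example dataset directly from a URL
# (the URL below is a placeholder, not the actual dataset location)
x_train, y_train = Dataset.load_from_url("https://example.com/movie_review.zip")
```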

Conclusions
Again, I have doubts that the machine learning model is "novel", as it has been previously published, etc. The current phrasing might be misunderstood as "introducing a novel machine learning model".

@sergioburdisso sergioburdisso added the documentation Improvements or additions to documentation label Dec 6, 2021
sergioburdisso added a commit that referenced this issue Apr 4, 2022
@sergioburdisso
Owner

Hi @hbaniecki!

It's been forever, I'm so sorry! I've moved to Switzerland and started working as a postdoctoral researcher here, and it was a huge change in my life.

I've addressed most of the points you highlighted (commit 61c8419), btw THANKS for your valuable advice. Below I'll address each one of your points:

Comments

  1. The title has been updated following your guidance; it is now "PySS3: A Python package implementing SS3, a simple and interpretable machine learning model for text classification". The word "novel" has been removed from the paper, as you suggested.
  2. This point will be addressed as part of the "Statement of need", following the other points you suggested there.
  3. I don't think the API is stable enough yet for a 1.0 version, but I will release a new version with the new changes (including loading datasets from a URL) and reference that version in the paper. Do you think that is OK? Of course, if you don't agree, we can talk about it, no problem! :)

Summary

  • Both points were fixed.

Statement of need
Yes, I totally agree. In the initial version I didn't include it due to space limitations (in fact, the paper exceeded the 1000-word limit). I'll address this point and the ones you raised in the following item together in this section. The idea is to talk a little bit about interpretability and explainability, cite the papers you suggested, and then add the "gap phrase", like "However, little attention has been paid...", and focus on the need for interpretable models (not just explainable, but interpretable, i.e. self-explainable). What do you think?

State of the field
These references and this discussion will be added above.

Implementation

  • All three points have been fixed.

Illustrative examples

  1. I've added a brief description of the dataset at the beginning.
  2. Updated the example using the Dataset.load_from_url() function.

Conclusions
I've changed the conclusions, removing "novel" and adding an extra sentence.

I'm still working on the changes regarding the "Statement of need"; I'll let you know as soon as I finish. Again, thank you so much for your review work, and apologies for the delay...
