Examples of pandas and plotly in jupyter notebook #56

gwenzel · 2020-04-13T19:32:02Z

Description:

A new Jupyter notebook was added, showing how to import data with pandas and how to plot it with the plotly package. The data and examples are the same as in the initial notebook written by @felipehuerta17 .

List of changes

Added pandas and plotly to pipfile.
Added datasets folder with data from Ealing.
Added notebook with examples on how to use pandas and plotly.

jia200x · 2020-04-13T20:45:47Z

Thanks for the contribution!

There are some comments that IMO should be addressed before it gets to a "ready to merge" state:

The commits do not follow the AngularJS commit message convention.
Some commits are unrelated to this PR (e.g. d3fbba9).
Commits should be atomic, i.e. there should be one commit per semantic change. E.g. one "chore(dataset): add csv with ealing data", one "docs(jupyter): add Plotly example", etc.
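For reference, a minimal sketch of what a header check for that convention could look like. The list of allowed types is an assumption based on the commonly cited AngularJS list, and `is_valid_header` is a hypothetical helper, not something that exists in this repository:

```python
import re

# Types commonly listed for the AngularJS convention (assumption: the
# project may accept a different set).
ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

# Header shape: type, optional (scope), colon, space, subject.
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<subject>\S.*)$"
)


def is_valid_header(header: str) -> bool:
    """Return True if the first line of a commit message matches type(scope): subject."""
    return HEADER_RE.match(header) is not None


print(is_valid_header("chore(dataset): add csv with ealing data"))
print(is_valid_header("docs(jupyter): add Plotly example"))
print(is_valid_header("Added pandas and plotly to pipfile."))
```

The first two headers are the ones suggested above and pass the check; the third has no type prefix and does not.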

On the other hand, IMO this should be independent of pipenv. I would expect people to use this Jupyter notebook right after doing "pip install opensir". So, neither "plotly" nor "pandas" should be in the Pipfile (although #35 added the dependency because it's intended to be shown as a Sphinx page).

Also, I would try to limit the scope of this file to one of these:

If this is intended to be used as a tutorial, I suggest following the approach of feat(model): extend fit to SIR-X #35 and adding more verbose information. Then we can include it in the Sphinx documentation.
On the other hand, if this is intended to be used as an example application, I suggest moving it to a "samples" directory, and then we can explicitly link the documentation to the list of available samples.

What do you think?

PS: Regarding comments 1-3, if you have any questions don't hesitate to ask @felipehuerta17 , @leandrolanzieri , @sasalatart or me.

felipehuerta17 · 2020-04-14T19:13:03Z

Hi all,

Thanks for the very polished contribution, @gwenzel! Apart from the TI aspects, this looks like a great example of how to:

Create a datasets/ealing.csv file. This is a much more reproducible approach and it differentiates the notebook from the SIR.ipynb. Additionally, it looks familiar for Data Scientists in terms of structure and data exploration.
The key message of the notebook is using Plotly to develop interactive visualizations. This makes the notebooks more user-friendly and increases the impact of open-sir. I agree with @jia200x that markdown cells will be very beneficial, but I think that the narrative should focus on the decisions that you made, @gwenzel, to create these amazing plots!

Observations
As the SIR model is already explained in SIR.ipynb, extend the narrative as follows:

Please describe the usage of plotly.graph_objects and make_subplots.
Provide references to plotly documentation if necessary. Just
Take the reader "by the hand" and assume that they know nothing. After the first plot, invite the reader to hover their mouse over the figure and explain what they should see.
Describe why these plots are better than the ones in SIR.ipynb. For example, including a datetime index on the x axis makes the results much more communicable and compelling.
Comment after Long Term model: with Plotly it's possible to read off the exact day of the peak and the number of infected directly, rather than providing an estimate by applying a function like "max". This greatly facilitates the process of obtaining results.

They look excellent and they are within the scope of open-sir, not only "plotly", because you are merging your modelling thinking with a data visualization approach.

2.1) Figure 2.1 looks great and it's much better displayed than using matplotlib.
Note that now model.solve() provides the solution in terms of number of infected, so you don't need to multiply by the population. I will make a pull request with the changes.
In the Sensitivity Analysis, is there a way to show the legends below the figures, so the figures can scale better in displays with smaller aspect ratio?

I think this should be a separate tutorial, and have crosslinks with SIR.ipynb. As the hello-world is SIR.ipynb, after the sensitivity analysis we can provide a link on SIR.ipynb to this notebook to show how to create more compelling visualizations. In a similar way, before importing data provide a brief description of the SIR model (no diff eqs), and provide a link to SIR.ipynb and the documentation so the reader can get more context.

felipehuerta17

Great contribution and first commit!

To be ready to merge:

Take onboard the specific comments on the different parts of the code.
Provide a narrative tutorial on "how to use plotly to increase the impact of the results that can be generated using open-sir"
With the new repository, you may need to do from opensir.models import SIR,SIRX
Let me know if you need help with the TI stuff for the commits. We can do this together in 5-10 minutes after you deal with the other changes. I tried to resolve similar comments on my own in the past; the first time it took a long time and was frustrating.

felipehuerta17 · 2020-04-14T19:15:30Z

Pipfile

@@ -14,6 +14,8 @@ matplotlib = "*"
 scipy = "*"
 jupyter = "*"
 sklearn = "*"
+pandas = "*"


This should be updated when you update your master repository and rebase this branch :)

"pandas" was already added, so this commit can be dropped

felipehuerta17 · 2020-04-14T19:17:59Z

Pipfile

@@ -14,6 +14,8 @@ matplotlib = "*"
 scipy = "*"
 jupyter = "*"
 sklearn = "*"
+pandas = "*"
+plotly = "*"


@jia200x @gwenzel here we need to make a decision whether we make plotly a dependency of opensir, or we write in the notebook that plotly is required and the user has to install it using pip install plotly in order to run the notebook.

At the moment, I wouldn't be against making plotly a dependency, as the plotly figures created by @gwenzel may in the future become part of a model.plot() function.

> @jia200x @gwenzel here we need to make a decision whether we make plotly a dependency of opensir, or we write in the notebook that plotly is required and the user has to install it using pip install plotly in order to run the notebook.

As described above, the user of this Jupyter notebook is the end user, not the developer.
Using Pipenv causes a lot of problems for end users (e.g. if the user doesn't have Python 3.7, they won't be able to run opensir).

> At the moment, I wouldn't be against making plotly a dependency, as the plotly figures created by @gwenzel may in the future become part of a model.plot() function.

If model.plot() ends up using plotly, then the PR that introduces it should add the dependency. Adding unused dependencies is not recommended because it adds more "moving parts".
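If plotly stays out of the Pipfile, the notebook could state its extra requirements in its first cell. A minimal sketch using only the standard library; `missing_packages` and `REQUIRED` are hypothetical names, and the package list is taken from this thread:

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages the notebook needs beyond opensir itself (as discussed in this thread).
REQUIRED = ["pandas", "plotly"]

missing = missing_packages(REQUIRED)
if missing:
    # Tell the end user exactly what to install instead of failing on import.
    print("Missing packages, install them with: pip install " + " ".join(missing))
```

This keeps the package's own dependency list minimal while still giving the end user a clear message when something is not installed.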

felipehuerta17 · 2020-04-14T19:18:41Z

datasets/ealing.csv

+3/26/2020,136
+3/27/2020,165
+3/28/2020,209
+3/29/2020,241


Love this and the idea of having a datasets folder
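A minimal sketch of how such a file could be loaded with pandas. The column names `date` and `cases` are assumptions (the real header is not shown in this diff), and an in-memory string with the four visible rows stands in for datasets/ealing.csv:

```python
import io

import pandas as pd

# Stand-in for datasets/ealing.csv: the four rows visible in the diff,
# with ASSUMED column names.
csv_text = "date,cases\n3/26/2020,136\n3/27/2020,165\n3/28/2020,209\n3/29/2020,241\n"

df = pd.read_csv(io.StringIO(csv_text))
# The file uses month/day/year dates, so state the format explicitly.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")
df = df.set_index("date")
# Day-over-day change in the reported number.
df["daily_change"] = df["cases"].diff()
print(df)
```

With the real file, pd.read_csv("datasets/ealing.csv") would replace the in-memory string, adjusted to whatever header the file actually has.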

felipehuerta17 · 2020-04-14T19:20:41Z

test_pandas_plotly.ipynb

+    "# Convert into seconds\n",
+    "tf_long = long_term_days-1\n",
+    "sol_long = my_SIR_fitted.solve(tf_long, long_term_days).fetch()\n",
+    "N_S_long = sol_long[:,1]*P\n",


Replace

N_S_long = sol_long[:,1]*P

with

N_S_long = sol_long[:,1]

because the solution is now returned as a number of people.

The same applies to lines 145 and 146.


gwenzel · 2020-04-16T17:41:56Z

Previous comments were taken into account and a new commit was made: e2ac3ea.
The new commit was also rebased onto the new master.

gwenzel added the documentation Improvements or additions to documentation label Apr 13, 2020

gwenzel requested review from jia200x, sasalatart and felipehuerta17 April 13, 2020 19:32

gwenzel self-assigned this Apr 13, 2020

felipehuerta17 requested changes Apr 14, 2020


docs: added notebook with plotly and pandas examples, and example csv input file.

e2ac3ea

gwenzel force-pushed the test branch from b6da0c0 to e2ac3ea Compare April 16, 2020 17:39

felipehuerta17 mentioned this pull request Apr 17, 2020

Bloated dependencies and scope of Jupyter notebooks #59

Open

sasalatart removed their request for review May 28, 2020 02:14
