Creating a Dashboard for Interactive Data Visualization with Dash in Python #609

Open
hawc2 opened this issue Mar 19, 2024 · 18 comments

@hawc2
Collaborator

hawc2 commented Mar 19, 2024

Programming Historian in English has received a proposal for a lesson, 'Creating a Dashboard for Interactive Data Visualization with Dash in Python' by @hluling.

I have circulated this proposal for feedback within the English team. We have considered this proposal for:

  • Openness: we advocate for use of open source software, open programming languages and open datasets
  • Global access: we serve a readership working with different operating systems and varying computational resources
  • Multilingualism: we celebrate methodologies and tools that can be applied or adapted for use in multilingual research-contexts
  • Sustainability: we're committed to publishing learning resources that can remain useful beyond present-day graphical user interfaces and current software versions

We are pleased to have invited @hluling to develop this Proposal into a Submission to be developed under the guidance of @caiocmello as editor.

The Submission package should include:

  • Lesson text (written in Markdown)
  • Figures: images / plots / graphs (if using)
  • Data assets: codebooks, sample dataset (if using)

We ask @hluling to share their Submission package with our Publishing team by email, copying in @caiocmello .

We've agreed a submission date of April. We ask @hluling to contact us if they need to revise this deadline.

When the Submission package is received, our Publishing team will process the new lesson materials, and prepare a Preview of the initial draft. They will post a comment in this Issue to provide the locations of all key files, as well as a link to the Preview where contributors can read the lesson as the draft progresses.

If we have not received the Submission package by April, @caiocmello will attempt to contact @hluling. If we do not receive any update, this Issue will be closed.

Our dedicated Ombudspersons are Ian Milligan (English), Silvia Gutiérrez De la Torre (español), Hélène Huet (français), and Luis Ferla (português). Please feel free to contact them at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudspersons will have no impact on the outcome of any peer review.

@charlottejmc
Collaborator

charlottejmc commented Apr 17, 2024

Hello @caiocmello and @hluling,

You can find the key files here:

You can review a preview of the lesson here:


I do have a question about two .py files in the assets. As far as I can understand,

  • app-rq2.py is a script to download the data for Research Question 2 (RQ2)
  • rq2-download.py is a script showing how the dashboard was set up for RQ2

How come these scripts are provided separately, rather than included as code blocks within the lesson? (I am slightly confused about how these scripts differ from the main code, which you've collated together under app.py.)

Thank you for clarifying!

@anisa-hawes
Contributor

anisa-hawes commented Apr 17, 2024

Thank you for processing these files, @charlottejmc!


Hello Luling @hluling,

What's happening now?

Your lesson has been moved to the next phase of our workflow which is Phase 2: Initial Edit.

In this Phase, your editor Caio @caiocmello will read your lesson, and provide some initial feedback. Caio will post feedback and suggestions as a comment in this Issue, so that you can revise your draft in the following Phase 3: Revision 1.

%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
              'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
              'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
              'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
       } } }%%
timeline
Section Phase 1 <br> Submission
Who worked on this? : Publishing Assistant (@charlottejmc) 
All  Phase 1 tasks completed? : Yes
Section Phase 2 <br> Initial Edit
Who's working on this? : Editor (@caiocmello)  
Expected completion date? : May 17
Section Phase 3 <br> Revision 1
Who's responsible? : Author (@hluling) 
Expected timeframe? : ~30 days after feedback is received

Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.

@hluling
Collaborator

hluling commented Apr 17, 2024

Thank you @charlottejmc. To clarify:

  • app-rq2.py is a script showing how the dashboard was set up for RQ2
  • rq2-download.py is a script to download the data for RQ2

The two RQs are based on two different data sources. app.py has the code for RQ1; the two other .py files are for RQ2. The main procedure and logic are demonstrated with RQ1, so I thought it might be repetitive to explain them again with RQ2. But I'd be happy to incorporate the RQ2 code into the main text of the lesson if that makes more sense.

The reason to separate app-rq2.py from rq2-download.py is that it takes some time to retrieve data using the Chronicling America API (RQ2), so it's not practical to incorporate the download procedure into the dashboard script.
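The separation described above can be sketched roughly as follows. This is only an illustration of the pattern (run the slow retrieval once, save the result, and let the dashboard read the saved file), not the code in rq2-download.py or app-rq2.py. The function names `fetch_rows`, `download_to_csv` and `load_for_dashboard`, the file name `rq2_data.csv`, and the placeholder rows are all hypothetical, and no real API call is made.

```python
import csv
import tempfile
from pathlib import Path


def fetch_rows():
    """Hypothetical stand-in for the slow API retrieval step.

    A real download script would query the API here; this placeholder
    returns fixed, made-up rows so the sketch runs offline.
    """
    return [
        {"year": "1920", "count": "3"},
        {"year": "1921", "count": "5"},
    ]


def download_to_csv(path):
    """Download script: run the slow step once and save the result to disk."""
    rows = fetch_rows()
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["year", "count"])
        writer.writeheader()
        writer.writerows(rows)


def load_for_dashboard(path):
    """Dashboard script: read the saved file only, with no network access."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Step 1 (download script): run once, ahead of time.
data_file = Path(tempfile.mkdtemp()) / "rq2_data.csv"  # hypothetical file name
download_to_csv(data_file)

# Step 2 (dashboard script): start quickly from the saved file.
rows = load_for_dashboard(data_file)
print(rows)
```

With this split, the dashboard script never waits on the API, and the saved file can also be shared directly with readers who only want to work on the visualisation.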

@charlottejmc
Collaborator

Thank you @hluling, that makes good sense to me now. Anisa and I did find this slightly confusing upon initial processing of the lesson, so this might indicate it will be confusing to readers as well. One solution would be to keep the code in a separate asset folder, but give clearer instructions to readers explaining this choice.

I will let @caiocmello share his view on this too!

@caiocmello
Collaborator

caiocmello commented May 8, 2024

Dear @hluling,

It has been such a pleasure reading your lesson. I've learnt a lot from it and I'm sure it will be a great contribution to PH! So, thanks very much for this! I have noted some suggestions to share at this stage, before the lesson goes to external review. I hope they are useful in improving the accessibility and usability of this material. The comments below indicate the paragraph, as annotated in the preview version.

  • Paragraph 5: Although it is okay and, actually, recommended to keep research questions simple for this tutorial, I would suggest a slight change in the text to make it more accurate. The fact that the U.S. television stations mention words such as Putin and Zelensky with the same frequency doesn’t necessarily mean 'balanced coverage of the event'. Therefore, I would suggest avoiding the word ‘balanced’ by simply adapting the research question to something like: ‘...concerns how the U.S. television stations have covered the war in Ukraine. One way to address…’ (or explaining what you mean by 'balanced').

  • Paragraph 21: It would be interesting to add a line here to state the parameters you chose. E.g.: ‘For the purpose of this lesson, keywords chosen are x, y, z. The geographic market is x…’.

  • The links to the two scripts are swapped:

  • Paragraph 39: The link for the script (data acquisition) is wrong. It should be https://github.com/programminghistorian/ph-submissions/blob/gh-pages/assets/interactive-data-visualization-dashboard/rq2-download.py

  • Paragraph 41: The link for the script (coding dashboard) is wrong. It should be https://github.com/programminghistorian/ph-submissions/blob/gh-pages/assets/interactive-data-visualization-dashboard/app-rq2.py

  • Other general suggestions:

  • I think the reader would benefit from seeing a spoiler of the final product at the beginning of the lesson. It could be a screenshot of the dashboard, or even the link for the live demo version you provided (https://ph-dash-demo.onrender.com/). It would have helped me to understand what my goal was if I had seen it before starting the lesson. (Let me know what you think about this.)

  • It would have also been helpful to know more about (to have a general overview of) the dataset from the beginning. This could be a screenshot, a table or 'schema'. Something that explains what is in there and how it is structured.

  • Regarding RQ1 and RQ2:

It was great to see that you included more than one research question in the lesson. Also, you provide a different set-up of the dashboard, showing how readers can customise it in different ways. This is excellent. I have, however, some suggestions regarding the way the RQs are structured in the text:

  • You present the two RQs at the beginning of the lesson, but as a reader I felt that the second RQ was forgotten later in the text. In paragraph 17, for example, you say: 'To address the research question...', and I was confused whether you were talking about RQ1 or RQ2. Therefore, I would suggest mentioning at the beginning of the lesson that you will provide an 'extra' RQ for those interested in learning by building and customising different dashboards. This way, RQ2 would be presented just at the end of the text and framed as 'extra' (non-essential) content of this lesson.
  • Considering this lesson is focused on data visualisation, would it be possible to provide the data for RQ2, instead of providing the code to download it? I appreciate the exercise of collecting the data, but in this case, I think it took me a very long time to download it, when I was primarily focused on visualising. Also, if kept in the lesson, I think the script for downloading data for RQ2 would have to be explained in detail in the lesson, as it is not simple.
  • It would also be great to see a spoiler of the dashboard for RQ2. This way the reader can choose whether it is worth it to look at the 'extra' content.

Final comment:

  • All my suggestions are based on the understanding that this is an advanced lesson, as it involves cloning repositories, creating virtual environments, navigating directories using the command line, reading API documentation, writing functions, etc. Therefore, I don't see the immediate need to detail the code much more.

These are my initial suggestions and I look forward to hearing back from you. I hope this is useful and feel free to get in touch if you have any questions.

@charlottejmc
Collaborator

Thank you very much @caiocmello – just a short note to let you and @hluling know that I've just taken care of switching the two asset links at paragraphs 39 and 41.

@hluling
Collaborator

hluling commented May 12, 2024

Thank you @caiocmello for the insightful feedback! I'm working on the edits.
@charlottejmc: Thanks for changing the links! Do I just upload the updated materials to my original repo? I also want to insert figures, and I'm looking at the instructions described here. Am I supposed to add the .png files to my original repo? I'll refer to the figures in the main text.

@anisa-hawes
Contributor

anisa-hawes commented May 12, 2024

Hello Luling @hluling,

If you'd like to slot in some figure images, please either upload them to your repository where we can download them, or email them to us as before. Charlotte and I will process these next week and put them in place for you!

Thank you,
Anisa

@anisa-hawes
Contributor

What's happening now?

Hello Luling @hluling. Your lesson has been moved to the next phase of our workflow which is Phase 3: Revision 1.

This Phase is an opportunity for you to revise your draft in response to @caiocmello's initial feedback.

I've sent you an invitation to join us as an Outside Collaborator here on GitHub. This gives you the Write access you'll need to edit your lesson directly.

We ask authors to work on their own files with direct commits: we prefer you don't fork our repo, or use the Pull Request system to edit in ph-submissions. You can make direct commits to your file here: /en/drafts/originals/interactive-data-visualization-dashboard.md. Charlotte and I can help if you encounter any practical problems!

When you and Caio are both happy with the revised draft, we will move forward to Phase 4: Open Peer Review.

%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
              'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
              'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
              'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
       } } }%%
timeline
Section Phase 2 <br> Initial Edit
Who worked on this? : Editor (@caiocmello) 
All  Phase 2 tasks completed? : Yes
Section Phase 3 <br> Revision 1
Who's working on this? : Author (@hluling)  
Expected completion date? : June 12
Section Phase 4 <br> Open Peer Review
Who's responsible? : Reviewers (TBC) 
Expected timeframe? : ~60 days after request is accepted

Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.

@hluling
Collaborator

hluling commented May 13, 2024

Hi @caiocmello, thanks again for the thorough review! Please see the revised lesson here: https://programminghistorian.github.io/ph-submissions/en/drafts/originals/interactive-data-visualization-dashboard
Feel free to let me know if I need to change anything else. Here is the list responding to each of your comments.

  • Paragraph 5: Although it is okay and, actually, recommended to keep research questions simple for this tutorial, I would suggest a slight change in the text to make it more accurate. The fact that the U.S. television stations mention words such as Putin and Zelensky with the same frequency doesn’t necessarily mean 'balanced coverage of the event'. Therefore, I would suggest avoiding the word ‘balanced’ by simply adapting the research question to something like: ‘...concerns how the U.S. television stations have covered the war in Ukraine. One way to address…’ (or explaining what you mean by 'balanced').

Revised as suggested (now in Paragraph 9).

  • Paragraph 21: It would be interesting to add a line here to state the parameters you chose. E.g.: ‘For the purpose of this lesson, keywords chosen are x, y, z. The geographic market is x…’.

Revised as suggested (now in Paragraph 30).

  • I think the reader would benefit from seeing a spoiler of the final product at the beginning of the lesson. It could be a screenshot of the dashboard, or even the link for the live demo version you provided (https://ph-dash-demo.onrender.com/). It would have helped me to understand what my goal was if I had seen it before starting the lesson. (Let me know what you think about this.)

This is a great idea. I added Figure 1 and Figure 2, showing screenshots of the two dashboards.

  • It would have also been helpful to know more about (to have a general overview of) the dataset from the beginning. This could be a screenshot, a table or 'schema'. Something that explains what is in there and how it is structured.

I agree. I added Figure 3 and Figure 4 showing screenshots for the two datasets.

  • You present the two RQs at the beginning of the lesson, but as a reader I felt that the second RQ was forgotten later in the text. In paragraph 17, for example, you say: 'To address the research question...', and I was confused whether you were talking about RQ1 or RQ2. Therefore, I would suggest mentioning at the beginning of the lesson that you will provide an 'extra' RQ for those interested in learning by building and customising different dashboards. This way, RQ2 would be presented just at the end of the text and framed as 'extra' (non-essential) content of this lesson.

I've revised and adjusted the language about the role of the two RQs (Paragraphs 4 and 27).

  • Considering this lesson is focused on data visualisation, would it be possible to provide the data for RQ2, instead of providing the code to download it? I appreciate the exercise of collecting the data, but in this case, I think it took me a very long time to download it, when I was primarily focused on visualising. Also, if kept in the lesson, I think the script for downloading data for RQ2 would have to be explained in detail in the lesson, as it is not simple.

I've added a link to download the dataset directly (Paragraph 48).

  • It would also be great to see a spoiler of the dashboard for RQ2. This way the reader can choose whether it is worth it to look at the 'extra' content.

The added Figure 2 shows a screenshot of the RQ2 dashboard.

@hluling
Collaborator

hluling commented May 13, 2024

Hi Anisa @anisa-hawes (thanks for the reply!) and @charlottejmc,

I've placed the 4 figures here: https://github.com/hluling/ph-dash/tree/master/interactive-data-visualization-dashboard. You can find the figure placeholders in the revised lesson draft: https://programminghistorian.github.io/ph-submissions/en/drafts/originals/interactive-data-visualization-dashboard

Also a quick note: I updated some files here: https://github.com/programminghistorian/ph-submissions/tree/gh-pages/assets/interactive-data-visualization-dashboard

@charlottejmc
Collaborator

Thank you @hluling, I've uploaded your four images and updated the placeholder links in the markdown file.

@caiocmello
Collaborator

Hi @hluling,

It looks great! Thanks for the rapid response and for your engagement in the process! @anisa-hawes and @charlottejmc will move the lesson to the next stage, external peer review. I will write to you once reviewers are assigned.

Best wishes,
Caio

@anisa-hawes
Contributor

anisa-hawes commented May 16, 2024

Hello Luling @hluling,

What's happening now?

Your lesson has been moved to the next phase of our workflow which is Phase 4: Open Peer Review.
This Phase will be an opportunity for you to hear feedback from peers in the community.

Caio @caiocmello has invited two reviewers to read your lesson, test your code, and provide constructive feedback. In the spirit of openness, reviews will be posted as comments in this Issue (unless you specifically request a closed review).

After both reviews, Caio will summarise the suggestions to clarify your priorities in Phase 5: Revision 2.

%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
              'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
              'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
              'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
       } } }%%
timeline
Section Phase 3 <br> Revision 1
Who worked on this? : Author (@hluling)
All  Phase 3 tasks completed? : Yes
Section Phase 4 <br> Open Peer Review
Who's working on this? : Diego Alves + Johannes Breuer
Expected completion date? : 22 July
Section Phase 5 <br> Revision 2
Who's responsible? : Author (@hluling)
Expected timeframe? : ~30 days after editor's summary

Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.

@anisa-hawes
Contributor

anisa-hawes commented May 19, 2024

Hello Luling @hluling,

I noticed that you updated the Open in Colab button link: e5c8350 but this was correct as we had it set up: https://colab.research.google.com/github/programminghistorian/ph-submissions/blob/gh-pages/assets/interactive-data-visualization-dashboard/interactive-data-visualization-dashboard.ipynb. We are hosting your Python notebook within our organisational Colab space and syncing this copy with the assets folder on our repo. If you want to make any edits or adjustments to the notebook, please coordinate with us. Either we can make the changes on your behalf, or we can add you as a co-editor on our Master copy of your notebook. We are not making any direct edits to the notebook here in the repo, rather we update the Master copy on Colab, then re-sync.

Thank you, Anisa

@hluling
Collaborator

hluling commented May 19, 2024

Thanks, @anisa-hawes. Sorry about that. I will let you know when there are changes.

@charlottejmc
Collaborator

charlottejmc commented May 22, 2024

Hi @hluling,

I apologise for the confusion. When you updated the notebook in your Phase 3 Revision commit, we didn't realise that this had replaced the code behind the Open in Colab button. I do appreciate that you noticed and tried to rectify it later!

However, what we actually need is for the link to refer back to the notebook as hosted on our own GitHub repo: you can see that I've changed it back to link to:
https://colab.research.google.com/github/programminghistorian/ph-submissions/blob/gh-pages/assets/interactive-data-visualization-dashboard/interactive-data-visualization-dashboard.ipynb, rather than:
https://colab.research.google.com/github/hluling/ph-dash/blob/master/interactive-data-visualization-dashboard.ipynb (While you'd renamed the .ipynb file correctly here, the button was still linking back to your own GitHub repo.)
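For reference, both links above follow the same pattern: the Colab address, then the GitHub owner, repository, branch, and path to the notebook. A minimal sketch of that pattern, inferred only from the two URLs quoted in this comment; the helper name `colab_url` is hypothetical and is not part of any Colab or GitHub tooling:

```python
def colab_url(owner, repo, branch, notebook_path):
    """Build an 'Open in Colab' link for a notebook hosted on GitHub.

    The URL layout is inferred from the two links quoted in this thread.
    """
    return (
        "https://colab.research.google.com/github/"
        f"{owner}/{repo}/blob/{branch}/{notebook_path}"
    )


# The link the button should use: the copy hosted in ph-submissions.
url = colab_url(
    "programminghistorian",
    "ph-submissions",
    "gh-pages",
    "assets/interactive-data-visualization-dashboard/"
    "interactive-data-visualization-dashboard.ipynb",
)
print(url)
```

Only the owner, repository, branch, and path segments differ between the two links, which is why renaming the .ipynb file alone did not change where the button pointed.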

@caiocmello
Collaborator

Open Peer Review

During Phases 2 and 3, I provided initial feedback on this lesson, then worked with @hluling to complete a first round of revisions.

In Phase 4 Open Peer Review, we invite feedback from others in our community.

Welcome Diego Alves @dfvalio and Johannes Breuer @jobreu. By participating in this peer review process, you are contributing to the creation of a useful and sustainable technical resource for the whole community. Thank you.

Please read the lesson, test the code, and post your review as a comment in this issue by July 22.

Reviewer Guidelines:

A preview of the lesson:

--
Notes:

  • All participants in this discussion are advised to read and be guided by our shared Code of Conduct.
  • Members of the wider community may also choose to contribute reviews.
  • All participants must adhere to our anti-harassment policy:

Anti-Harassment Policy

This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.

Programming Historian in English is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinise ideas, to ask questions, make suggestions, or request clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience. We do not tolerate harassment or ad hominem attacks of community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. If anyone witnesses or feels they have been the victim of the above described activity, please contact our ombudsperson Dr Ian Milligan. Thank you for helping us to create a safe space.

Projects
Status: 4 Open Peer Review

5 participants