Add Binder Button for JupyterLab and RStudio Server #286

Open
davidrpugh opened this issue Jul 8, 2019 · 11 comments

Comments

@davidrpugh

Is there interest in adding Binder buttons to the README? I have already added a Binder button to the Software Carpentry Python lesson and am in the process of adding a Binder button to the Software Carpentry R lesson.

You would need to create two orphaned branches called binder-python and binder-r. I would then open a PR to add the files required to configure a Binder instance for JupyterLab and RStudio with all required packages/libraries pre-installed. I would then open a second PR to add two buttons to the README of the gh-pages branch.
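
For anyone unfamiliar with orphaned branches, the setup could be sketched roughly as follows. This is only an illustration and not necessarily how the sample branches were made: the helper names `orphan_branch_commands` and `create_orphan_branch` and the dry-run default are assumptions for the example, while the git subcommands themselves are standard.

```python
import subprocess


def orphan_branch_commands(branch):
    """Return the git commands that create an empty orphaned branch.

    Hypothetical helper for illustration; it only builds the commands.
    """
    return [
        # Start a new branch with no parent commit (no shared history).
        ["git", "checkout", "--orphan", branch],
        # Remove the tracked files carried over from the previous branch.
        ["git", "rm", "-rf", "--ignore-unmatch", "."],
        # Record an initial empty commit so the branch can be pushed.
        ["git", "commit", "--allow-empty", "-m", f"Initialize {branch}"],
    ]


def create_orphan_branch(branch, dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    for cmd in orphan_branch_commands(branch):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for name in ("binder-python", "binder-r"):
        create_orphan_branch(name)  # dry run: prints, changes nothing
```

By default the script only prints the commands, so it is safe to run outside a repository; the Binder configuration files would then be added to each branch through a PR as described above.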

Once complete, learners would be able to click the button for either Python or R and, after a minute or two, would have access to an instance of JupyterLab or RStudio Server running in the cloud with all software pre-installed and ready to go for the lesson.

@remram44
Contributor

remram44 commented Jul 8, 2019

I don't think there is much point using JupyterLab for this lesson, over the graphical program DB Browser for SQLite. It just adds unnecessary complexity to import data and run SQL code.

@davidrpugh
Author

@remram44 Thanks for the feedback. If most learners were only interested in learning SQL for its own sake then I would agree with you.

However, in my experience most (all?) learners want to learn to use SQL from either Python or R and are most interested in the last two episodes which demonstrate how this is done. Providing links to Binder instances makes it easier for learners to follow along with the last two episodes in an environment that more closely mimics their day-to-day research environment (than say DB Browser); learners can also easily replicate the Binder environments on their local machine later should they wish to do so.

Since both JupyterLab and RStudio provide a terminal window, I generally teach the first several SQL episodes using the terminal from within either JupyterLab or RStudio (depending on whether the majority of the learners are more interested in Python or R) and then switch to either notebooks or R scripts for the last two lessons. The ability to seamlessly switch from the terminal to a Python or R work environment helps reinforce the usefulness of learning SQL and integrating it into daily workflows.

Finally, by including the links in the README we are simply providing the option for instructors and learners to use the Binder instances if they wish.

@davidrpugh
Author

To give everyone an idea of what this would look like I have created the two orphaned branches mentioned above and created a third branch off of gh-pages where I have added the buttons.

https://github.com/kaust-vislab/sql-novice-survey/tree/add-binder-buttons

@henrykironde
Contributor

@remram44, I totally agree with what you have said, especially about the complexity of maintaining these branches. However, after looking at the sample, I think it could be useful for those people who have handled the first part of using the DB Browser.

If we decide to add this work to the project, I think we need to devise a test mechanism, or otherwise make sure that everything (code and data) stays in sync with changes in the main repo and that PRs to the main repo are tested against the notebooks infrastructure before a merge.
@davidrpugh thanks for the issue and the sample. Please let me know more regarding my concerns, as I have not handled this kind of integration.

@davidrpugh
Author

@henrykironde Thanks for your feedback. First, I can commit to maintaining those branches as the maintenance burden is low. The Binder service will automatically re-build the images when there are changes to those branches. Changes to the branches are most likely to occur as we version bump the dependencies over time.

The notebooks directory can/should be removed. Currently I have included some notebooks that replicate and extend the code in the Databases and Python episode but this is just for demonstration purposes. That material, if there is interest in including it, should go into its own PR for possible inclusion into the main lesson material. This way the JupyterLab instance and the RStudio instance both start with only the survey.db file and it would be up to individual instructors to determine how to leverage JupyterLab or RStudio as they see fit in their workshops. This also eliminates the need to sync any code between main lesson and the Binder instances (which would be burdensome!).

Hopefully that makes sense!

@remram44
Contributor

remram44 commented Jul 9, 2019

The lesson as currently organized only deals with use from R or Python in the last chapter. Maybe only that one needs to be available from Binder then? This would make it a lot easier to maintain, and would avoid overloading the learners with this complicated environment (SQL in Python in Jupyter in Binder...).

There are also practical reasons not to conduct the full lesson in Binder. Exploring/manipulating the CSVs directly, through programs the learners are used to (Excel), would not be possible in such an environment.

@remram44
Contributor

remram44 commented Jul 9, 2019

On a separate note, does this need to be a separate branch? It could probably be on gh-pages, provided the out-of-date notebooks are removed (to not cause confusion).

@davidrpugh
Author

davidrpugh commented Jul 9, 2019

> The lesson as currently organized only deals with use from R or Python in the last chapter. Maybe only that one needs to be available from Binder then? This would make it a lot easier to maintain, and would avoid overloading the learners with this complicated environment (SQL in Python in Jupyter in Binder...).

I think I may have created confusion by including the notebooks in the binder-python branch (and have now removed them). The branch should only contain the survey.db file and the files necessary to run JupyterLab on Binder (similar for binder-r branch and RStudio). Instructors should live code the last two lessons just as they normally would and for Python can choose between scripts or notebooks for the coding.

> There are also practical reasons not to conduct the full lesson in Binder. Exploring/manipulating the CSVs directly, through programs the learners are used to (Excel), would not be possible in such an environment.

I think what is practical depends on the audience and the instructor and by providing the Binder instances we are providing additional options for instructors and learners.

There are two ways in which the JupyterLab and RStudio Binder instances could be used in teaching this lesson.

  1. Instructor teaches all but the last two episodes in the "normal" fashion but then uses the JupyterLab Binder instance to teach the Programming with Databases-Python lesson and/or uses the RStudio Binder instance to teach the Programming with Databases-R lesson.
  2. Instructor teaches the first 9 lessons using the terminal in either JupyterLab or RStudio (depending on the audience) and then switches to using scripts/notebooks in JupyterLab or scripts in RStudio for the last two lessons.

I have tried both approaches, and both my audiences and I have preferred option 2. When I teach the course, I use a Conda environment created on my local machine from the environment.yml file in the binder-python branch and then use JupyterLab as my teaching environment. The JupyterLab Binder instance allows learners to replicate my environment without installing anything, and to easily recreate it locally later should they choose to do so. The same applies to R users and the RStudio Binder instance.

@davidrpugh
Author

> On a separate note, does this need to be a separate branch? It could probably be on gh-pages, provided the out-of-date notebooks are removed (to not cause confusion).

These branches do need to be separate from the main branch, for two reasons.

  1. If we moved the Binder config files, say for Python and JupyterLab, to the gh-pages branch, then when Binder builds the instance all of the files in the gh-pages branch would end up being copied into the working directory of the Binder instance, which is undesirable as it causes confusion for learners.
  2. We need Binder instances for both JupyterLab and RStudio and you cannot put config files for multiple languages into a single branch.

@davidrpugh
Author

I have added a JupyterLab extension called jupyterlab_sql which provides DB Browser like functionality to JupyterLab. I have not taught with this extension yet but will do so this coming semester.

After launching the Binder instance, you click the SQL launcher button and then provide the following URL to connect to the SQLite DB.

sqlite:///data/survey.db

After that things look pretty similar to DB Browser (to me at least!). Let me know if the above doesn't work for you...
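
As a rough way to sanity-check the same database from Python, here is a minimal sketch using only the standard library. It assumes the URL above follows the SQLAlchemy-style convention in which everything after `sqlite:///` is the file path; the helper names `sqlite_url_to_path` and `list_tables` and the `demo` table are made up for the example and are not part of jupyterlab_sql.

```python
import sqlite3

URL_PREFIX = "sqlite:///"


def sqlite_url_to_path(url):
    """Strip the SQLAlchemy-style prefix, leaving the file path."""
    if not url.startswith(URL_PREFIX):
        raise ValueError(f"not a SQLite URL: {url!r}")
    return url[len(URL_PREFIX):]


def list_tables(conn):
    """Return the names of the tables in the connected database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(sqlite_url_to_path("sqlite:///data/survey.db"))
    # Stand-in database so the sketch runs anywhere; in the Binder instance
    # one would pass the path printed above to sqlite3.connect() instead.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, label TEXT)")
    print(list_tables(conn))
    conn.close()
```

Inside the Binder instance, connecting to the path returned by `sqlite_url_to_path` and calling `list_tables` should show the lesson's tables, provided the working directory is the repository root.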

@davidrpugh
Author

@remram44 @henrykironde Ping!

Prepping to teach these materials at an upcoming workshop and wanted to see if there is any interest in having Binder buttons to launch JupyterLab and RStudio with SQL on Binder.

The relevant branches can be found at the following links.

https://github.com/kaust-vislab/sql-novice-survey/tree/binder-python
https://github.com/kaust-vislab/sql-novice-survey/tree/binder-r

If you wish to include this material then you will need to create orphaned branches called binder-python and binder-r and then I will open the PRs. Once these PRs have been merged, then I will open a third PR to add the buttons to the README on the main gh-pages branch.

I use these branches whenever I teach this material and am thus happy to commit to maintain the binder-python and binder-r branches.

You can test drive the JupyterLab instance with SQL extensions installed here:

https://mybinder.org/v2/gh/kaust-vislab/sql-novice-survey/binder-python?urlpath=lab

You can test drive the RStudio instance with SQL extensions installed here:

https://mybinder.org/v2/gh/kaust-vislab/sql-novice-survey/binder-r?urlpath=rstudio
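
If the branches move to another organization, the launch links and README buttons would need to be regenerated. A small sketch of how they could be built from the URL pattern above: the function names are hypothetical, and the badge image URL is my assumption of the standard mybinder.org badge rather than something taken from the sample branch.

```python
from urllib.parse import urlencode

BINDER_BASE = "https://mybinder.org/v2/gh"
# Assumed location of the standard Binder badge image.
BADGE_IMAGE = "https://mybinder.org/badge_logo.svg"


def binder_url(org, repo, ref, urlpath):
    """Build a mybinder.org launch URL for a GitHub repo and branch."""
    query = urlencode({"urlpath": urlpath})
    return f"{BINDER_BASE}/{org}/{repo}/{ref}?{query}"


def badge_markdown(url):
    """Wrap a launch URL in Markdown for a clickable README badge."""
    return f"[![Binder]({BADGE_IMAGE})]({url})"


if __name__ == "__main__":
    targets = {"binder-python": "lab", "binder-r": "rstudio"}
    for branch, urlpath in targets.items():
        url = binder_url("kaust-vislab", "sql-novice-survey", branch, urlpath)
        print(badge_markdown(url))
```

Swapping `kaust-vislab` for the organization that ends up hosting the branches would give the links for the final README.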
