Commit

improved Docker notes and removed uneeded LOs
ttimbers committed Feb 6, 2024
1 parent f14b0cc commit cb0be55
Showing 61 changed files with 898 additions and 5,953 deletions.
Binary file removed docs/_images/Makefile.png
Binary file not shown.
Binary file removed docs/_images/art_of_ds_cycle.png
Binary file removed docs/_images/business_suit.gif
Binary file removed docs/_images/commit-visit.png
Binary file removed docs/_images/commits.png
Binary file removed docs/_images/data-science-workflow.png
Binary file removed docs/_images/docker-hub-eg.png
Binary file removed docs/_images/dockerfile.png
Binary file removed docs/_images/imp_life_ds.png
Binary file removed docs/_images/inbox-notification.png
Binary file removed docs/_images/issue_thread.png
Binary file removed docs/_images/open_issues.png
Binary file removed docs/_images/pipeline.png
Binary file removed docs/_images/release-visit.png
Binary file removed docs/_images/release_eg.png
1,072 changes: 1 addition & 1,071 deletions docs/_sources/materials/lectures/01-intro-to-ds-workflows-Copy1.ipynb

Large diffs are not rendered by default.

21 changes: 14 additions & 7 deletions docs/_sources/materials/lectures/05-containerization-1.ipynb
@@ -388,6 +388,13 @@
"```\n",
"FROM continuumio/miniconda3\n",
"\n",
"# Install Git, the nano-tiny text editor and less (needed for R help)\n",
"RUN apt-get update && \\\n",
" apt-get install --yes \\\n",
" git \\\n",
" nano-tiny \\\n",
" less\n",
"\n",
"# Install Jupyter, JupterLab, R & the IRkernel\n",
"RUN conda install -y --quiet \\\n",
" jupyter \\\n",
@@ -404,13 +411,6 @@
"# Make port 8888 available for JupyterLab\n",
"EXPOSE 8888\n",
"\n",
"# Install Git, the nano-tiny text editor and less (needed for R help)\n",
"RUN apt-get update && \\\n",
" apt-get install --yes \\\n",
" git \\\n",
" nano-tiny \\\n",
" less\n",
"\n",
"# Copy JupyterLab start-up script into container\n",
"COPY start-notebook.sh /usr/local/bin/\n",
"\n",
@@ -431,6 +431,13 @@
"\n",
"*Question: What images does it build off?*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
106 changes: 94 additions & 12 deletions docs/_sources/materials/lectures/05-containerization-2.ipynb
@@ -131,8 +131,13 @@
"- Type the command below to run the container again, exit it and prove to yourself that the container was deleted (but not the image!):\n",
"\n",
"```\n",
"docker run -it --rm continuumio/miniconda3:23.9.0-0\n",
"```"
"docker run \\\n",
" --rm \\\n",
" --it \\\n",
" continuumio/miniconda3:23.9.0-0\n",
"```\n",
"\n",
"> *Note: we are using `\\` above to split a bash command across lines to make it more readable. YOu will see that throughout this chapter.*"
]
},
{
@@ -154,7 +159,11 @@
"To mount our current directory to a container from the `continuumio/miniconda3` image we type the following on your laptop:\n",
"\n",
"```\n",
"docker run -it --rm -v /$(pwd):/home/my_mounted_volume continuumio/miniconda3\n",
"docker run \\\n",
" --rm \\\n",
" -it \\\n",
" -v /$(pwd):/home/my_mounted_volume \\\n",
" continuumio/miniconda3\n",
"```\n",
"\n",
"Navigate to the directory where you mounted your files via: `cd /home/my_mounted_volume` and type `ls` to ensure you can see them.\n",
@@ -176,13 +185,19 @@
"\n",
"[Docker documentation on Container networking](https://docs.docker.com/config/containers/container-networking/)\n",
"\n",
"If we want to use a graphical user interface (GUI) with our containers, for example to be able to use the computational environment in the container in an integrated development environment (IDE) such as RStudio or JupyterLab, then we need to map the correct port from the container to a port on our computer.\n",
"If we want to use a graphical user interface (GUI) with our containers, for example to be able to use the computational environment in the container in an integrated development environment (IDE) such as RStudio or JupyterLab, then we need to map the correct port from the container to a port on our computer. \n",
"\n",
"> Note: In computer science, ports are points where network connections start and end. They can be physical (e.g., USB ports, Ethernet ports, etc) or virtual. In the case of virtual ports, they are really a software-based addressing mechanism that identifies points to connect specific processes or types of network services. When we are discussing ports in the context of containerization, we are referring to virtual ports.\n",
"\n",
"To do this, we use the `-p` flag with `docker run`, specifying the port in the container on the left-hand side, and the port on your computer (the container/Docker host) on the right-hand side of `:`. For example, to run the `rocker/rstudio` container image we would type `-p 8787:8787` to map the ports as shown in the `docker run` command below:\n",
"\n",
"\n",
"```\n",
"docker run --rm -p 8787:8787 -e PASSWORD=\"apassword\" rocker/rstudio:4.3.2\n",
"docker run \\\n",
" --rm \\\n",
" -p 8787:8787 \\\n",
" -e PASSWORD=\"apassword\" \\\n",
" rocker/rstudio:4.3.2\n",
"```\n",
"\n",
"Then to access the web app, we need to navigate a browser url to `http://localhost:<COMPUTER_PORT>`. In this case we would navigate to <http://localhost:8787> to use the RStudio server web app from the container.\n",
@@ -191,7 +206,11 @@
"our computer (the container/Docker host) has many ports we can choose from to map. So if we wanted to run a second `rocker/rstudio` container, then we could map it to a different port as shown below:\n",
"\n",
"```\n",
"docker run --rm -p 8788:8787 -e PASSWORD=\"apassword\" rocker/rstudio:4.3.2\n",
"docker run \\\n",
" --rm \\\n",
" -p 8788:8787 \\\n",
" -e PASSWORD=\"apassword\" \\\n",
" rocker/rstudio:4.3.2\n",
"```\n",
"\n",
"When we do this, to run the app in a browser on our computer, we need to go to <http://localhost:8788> (instead of <http://localhost:8787>) to access this container as we mapped it to the `8788` port on our computer (and not `8787`).\n",
@@ -204,6 +223,40 @@
"*Source: <https://hub.docker.com/r/rocker/rstudio>*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Specifying the image architecture/platform\n",
"\n",
"Newer M1 and M2 Macs use a new processor chip, called ARM,\n",
"that is a different architecture compared to the previous\n",
"Macs, and current Windows and Linux machines (which use Intel Processors).\n",
"Given that containerization software virtualizes\n",
"at the level of the operating system user space,\n",
"these different architectures lead to building containers with different architectures.\n",
"\n",
"Also given that Newer M1 and M2 Macs are still the minority of computers \n",
"in use, it is a better practice to work with container architectures that \n",
"work for the majority of in use computers, which are those that have Intel Processors.\n",
"To tell Docker to do this, \n",
"we add the `--platform=linux/amd64` argument to our Docker `run` and `build` \n",
"commands. \n",
"\n",
"To make this process even smoother and less error prone,\n",
"we should also set our Docker Desktop \n",
"to use Rosetta 2 x86/AMD64 emulation on M1/M2 Macs .\n",
"To use this, you must:\n",
"- make sure Rosetta 2 is installed on your Mac (instructions to install it [here](https://support.apple.com/en-ca/HT211861))\n",
"- Select \"Use Virtualization framework\" and \"Use Rosetta for x86/amd64 emulation on Apple Silicon\" in the General settings tab of Docker Desktop.\n",
"\n",
"*Note 1: In computer science, emulation works to let you run run software and execute programs originally designed one computer system on another computer system. Emulation is similar to virtualization in concept, but differs from it in that it focuses on enabling software designed for entirely different architectures to be executed.*\n",
"\n",
"*Note 2: You must also be using macOS Ventura or later to use this feature.*\n",
"\n",
"*Note 3: You will still need to use the `--platform linux/amd64` command when building or running images even when using Rosetta 2 emulation, because your computer can run and build both `linux/arm64` and `linux/amd64` images. So you have to be clear which architecture you want to work with.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -226,7 +279,11 @@
"we would run:\n",
"\n",
"```\n",
"docker run --rm -it rocker/rstudio:4.3.2 bash\n",
"docker run \\\n",
" --rm \\\n",
" -it \\\n",
" rocker/rstudio:4.3.2 \\\n",
" bash\n",
"```\n",
"\n",
"Notice the command above does not specify the ports,\n",
@@ -244,14 +301,17 @@
"The general form for for running things non-interactively is this:\n",
"\n",
"```\n",
"docker run --rm -v PATH_ON_YOUR_COMPUTER:VOLUME_ON_CONTAINER DOCKER_IMAGE PROGRAM_TO_RUN PROGRAM_ARGUMENTS\n",
"docker run \\\n",
" --rm \\\n",
" -v PATH_ON_YOUR_COMPUTER:VOLUME_ON_CONTAINER DOCKER_IMAGE PROGRAM_TO_RUN \\\n",
" PROGRAM_ARGUMENTS\n",
"```\n",
"\n",
"What of instead running the container insteractively, we wanted to run a script? \n",
"Let's take this R script, named `snowman.R`, shown below, \n",
"which uses the `cowsay::say` function to print some asci art with a cute message! \n",
"\n",
"```{R}\n",
"```\n",
"# snowman.R\n",
"\n",
"library(cowsay)\n",
@@ -286,9 +346,31 @@
"\n",
"Now that was a silly example, but this can be made powerful so that we can run an analysis pipeline, such as a `Makefile` non-interactively using Docker! \n",
"\n",
"Let's do this exercise to demonstrate:\n",
"\n",
"1. Clone this GitHub repository: <https://github.com/ttimbers/breast_cancer_predictor_py>\n",
"\n",
"2. Navigate into the root of the `breast_cancer_predictor_py` project on your computer using the command line and enter the following command to reset the project to a clean state (i.e., remove all files generated by previous runs of the analysis): \n",
"\n",
"```\n",
"docker run --rm -v /$(pwd):/home/rstudio/data_analysis_eg ttimbers/data_analysis_pipeline_eg make -C /home/rstudio/data_analysis_eg all\n",
"```"
"docker run \\\n",
" --rm \\\n",
" -v .:/home/jovyan \\\n",
" ttimbers/breast_cancer_predictor_py:d285fc9 \\\n",
" make clean\n",
"```\n",
"\n",
"3. To run the analysis in its entirety, enter the following command in the terminal in the project root:\n",
"\n",
"```\n",
"docker run \\\n",
" --rm \\\n",
" -v .:/home/jovyan \\\n",
" ttimbers/breast_cancer_predictor_py:d285fc9 \\\n",
" make all\n",
"```\n",
"\n",
"Note: If you are on a M1/M2 Mac, don't forget to include `--platform=linux/amd64` in your run command."
]
},
{
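The reminder above about `--platform=linux/amd64` can be sketched as a small shell helper that adds the flag only on ARM hosts. `platform_flag` is a hypothetical name used for illustration; the sketch assumes `uname -m` reports `arm64` (macOS on Apple silicon) or `aarch64` (Linux on ARM), and it prints the command rather than running it.

```shell
# Hypothetical helper (not part of Docker): print the platform flag
# for a given `uname -m` value, or nothing on other architectures.
platform_flag() {
  case "$1" in
    arm64|aarch64) printf '%s\n' '--platform=linux/amd64' ;;
    *) : ;;  # e.g. x86_64: no extra flag needed
  esac
}

flag="$(platform_flag "$(uname -m)")"

# Dry run: print the command instead of executing it.
# $flag is intentionally unquoted so an empty value adds no argument.
echo docker run --rm $flag -v .:/home/jovyan ttimbers/breast_cancer_predictor_py:d285fc9 make all
```

Passing the flag explicitly every time is also fine; the helper only saves typing when the same script is shared between Intel/AMD and Apple silicon machines.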
@@ -421,7 +503,7 @@
"(in the example below we use `make` to run the data analysis pipeline script).\n",
"\n",
"```\n",
"docker-compose run --rm analysis-env make -C /home/rstudio/breast_cancer_predictor all\n",
"docker-compose run --rm analysis-env make all\n",
"```"
]
}
