Skip to content

Latest commit

 

History

History
50 lines (35 loc) · 2.73 KB

4. JupyterNotebookOnBeluga.md

File metadata and controls

50 lines (35 loc) · 2.73 KB

Using Jupyter Notebook on Beluga

Contrarily to other Compute Canada Cluster, Beluga is relatively new and no visualization interfaces are available on Beluga. That means that if you want to observe your result/plots, this won't be possible.

You could also download your results on your computer and run-it, but this would take a lot of storage memory.

From my current knowledge, using Jupyter is the most straight forward way to visualize figures on big datasets from Compute Canada.

It also allows to run simple commands, as you would on your terminal, without having to use the sbatch every time This is possible because you need to allocate memory (processing power) to your jupyter notebook as you open it (so theorically it is a running job like the sbatch).

First you need to have the proper installations. You will in this case make use of the virtual environmenbt that you create on the GitPage #1.

We're gonna go our directory of interest and then activate our virtual environment.

cd home/username/scratch
module load python/3.7.4
source ENV/bin/activate

You should notice that now the name of your command line has changed.

You will also need to install in this environment the necessary jupyter notebook installations. Note that all the relevant info can be found here: https://docs.computecanada.ca/wiki/Jupyter

First, we're going to download jupyter. Then, we'll create a wrapper scrit that lauches our Notebook. Finally, we're going to make the script executable.

pip install jupyter
echo -e '#!/bin/bash\nunset XDG_RUNTIME_DIR\njupyter notebook --ip $(hostname -f) --no-browser' > $VIRTUAL_ENV/bin/notebook.sh
chmod u+x $VIRTUAL_ENV/bin/notebook.sh

Now, our installations is complete. As I mentionned before, we will need to open the notebook as we would when we run jobs; by allocating memory and time.

Let's have a basic command that allows us to use our notebook for 3 hours and that uses 2 GB of storage.

alloc --time=3:0:0 --ntasks=1 --cpus-per-task=2 --mem-per-cpu=2GB --account=def-YOUR.PI srun $VIRTUAL_ENV/bin/notebook.sh 

This will take a few minutes, depending on your priority (how much you've used the cluster). You will obtain an URL. If you copy/paste this URL in your browser, you will notice it is not actually working. Why? Because Compute Canda has no way to connect your browser to your cluster.

To do that, you will need to create some sort of VPN whch you can connect to from your computer (Not sure how to phrase it). Open a new terminal from your local computer. If you have a MacOS, type:

pip install sshuttle
sshuttle --dns -Nr <username>@beluga.computecanada.ca

If you have a windows, I would STRONGLY advise to get a linux environment.

Now, go back to your URL, and it should be working.