Merge pull request #27 from telatin/main
Update vscode + Add nf-core
rpoplawski0 committed Mar 19, 2024
2 parents 4224c57 + f003890 commit b93ed5f
Showing 6 changed files with 121 additions and 8 deletions.
Binary file added docs/img/nf-fetch-run.png
6 changes: 5 additions & 1 deletion docs/index.md
@@ -47,5 +47,9 @@ A simple walk-through of some CLIMB-BIG-DATA functionality.
[QIIME 2](walkthroughs/qiime2.md)
How to install QIIME 2 on a notebook server and basic usage.

[nf-core pipelines](walkthroughs/nfcore.md)
How to run some of the nf-core pipelines on CLIMB notebooks.

[How to fix login error 403](notebook-servers/403-forbidden-error.md)
An explanation of how to resolve login error 403 when accessing notebooks.

4 changes: 2 additions & 2 deletions docs/notebook-servers/index.md
@@ -20,8 +20,8 @@ How to install software using Conda, in the context of a containerized environme
[Using Nextflow](using-nextflow.md)
How to use Nextflow with CLIMB-BIG-DATA.

[Using VS Code](using-vscode.md)
How to connect to your CLIMB Notebook using VS Code
[Using Visual Studio Code](using-vscode.md)
How to connect to your CLIMB Notebook and work from Visual Studio Code.

[How to fix login error 403](403-forbidden-error.md)
An explanation of how to resolve login error 403 when accessing notebooks.
24 changes: 21 additions & 3 deletions docs/notebook-servers/using-vscode.md
@@ -1,7 +1,12 @@
# Editing with VS Code
# Working on your notebook from Visual Studio Code

Visual Studio Code, or VS Code, is a very popular [IDE](https://aws.amazon.com/what-is/ide/).
In this tutorial we will see how to use VS code to connect to your CLIMB Jupyter Notebook
In this tutorial we will see how to use VS Code to connect to your CLIMB Jupyter Notebook.

This tutorial shows how to configure your CLIMB-BIG-DATA notebook to accept a connection from
your local installation of Visual Studio Code. You will then be able to use the code editor, the terminal, and the drag-and-drop file bar of Visual Studio Code instead of the web interface, which is great but limited.

This tutorial is for advanced users who want to integrate their workflow with their CLIMB-BIG-DATA notebook.

<!-- prettier-ignore -->
!!! Prerequisites
@@ -11,6 +16,11 @@ In this tutorial we will see how to use VS code to connect to your CLIMB Jupyter

## Install "Tunnels"

<!-- prettier-ignore -->
!!! Further reading
    The technology behind this tutorial is described in the [Developing with Remote Tunnels](https://code.visualstudio.com/docs/remote/tunnels) page of the Visual Studio Code documentation.
    Note that the approach in the section *How can I ensure I keep my tunnel running?* will not work on notebooks.

Inside your local VS Code, install the extension `Remote - Tunnels` by Microsoft.

After you install it, go to your *Remotes* tab and log in using your GitHub account (you should see a *Sign in using your GitHub account* item in the menu).
@@ -71,4 +81,12 @@ Open this link in your browser https://vscode.dev/tunnel/jupyter-telatin-2enxf

When you are done, you can either click on the link shown in your terminal, or refresh
your tunnels list in your *local* VS Code. As shown in the image above, you should see
`jupyter-groupname-id` (or the custom name you gave).

## What you can do now

1. Your Visual Studio Code **terminal** will now display your CLIMB terminal: you will find the paths and the Conda environments of your notebook, and commands will run on your notebook.
2. Your file navigation will show your CLIMB files, and you will be able to download and upload files by dragging and dropping them from or onto the left sidebar.
3. Most notably, your code editor will be Visual Studio Code: you will have its syntax highlighting, multi-edit, plug-ins and other features to edit and visualise your CLIMB-BIG-DATA files.
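
A quick way to confirm that the integrated terminal is running on the notebook rather than on your local machine is to print where the shell is running. This is a minimal sketch using standard commands only; the output depends on your notebook, and the Conda check assumes that `conda` is on the `PATH` of that shell.

```bash
# Run these in the VS Code terminal after connecting through the tunnel.
uname -n   # host name of the machine running the shell
pwd        # current directory on that machine

# List the Conda environments only if conda is available in this shell.
if command -v conda >/dev/null 2>&1; then
    conda env list
fi
```

If the host name and paths are those of your laptop, the window is probably not connected to the tunnel.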


90 changes: 90 additions & 0 deletions docs/walkthroughs/nfcore.md
@@ -0,0 +1,90 @@
# Running nf-core pipelines

## What are nf-core pipelines?

[nf-core](https://nf-co.re/) is an organisation backing an international effort to create high-quality,
reproducible pipelines written in [Nextflow](https://nextflow.io/).

Some examples of nf-core pipelines include:

* [nf-core/fetchngs](https://nf-co.re/fetchngs/): to download raw datasets from public repositories (ENA, SRA...)
* [nf-core/rnaseq](https://nf-co.re/rnaseq/): to align and quantify RNA-Seq datasets, producing gene expression counts and quality-control reports
* [nf-core/ampliseq](https://nf-co.re/ampliseq/): to analyse metabarcoding (16S, ITS...) experiments (mostly based on QIIME 2)
* [nf-core/taxprofiler](https://nf-co.re/taxprofiler/): to run multiple taxonomy profiling tools on a metagenomics dataset
* [nf-core/mag](https://nf-co.re/mag/): to assemble and bin whole metagenome sequencing runs

See the full list [online](https://nf-co.re/pipelines).

## How to run an nf-core pipeline?

There is very good [documentation](https://nf-co.re/docs) available on the nf-core website, along with
a great set of video tutorials.

The first run of any pipeline should use its *test* profile. With this profile the pipeline
analyses a small test dataset that is known to work; once that run finishes successfully, we can move on to our own data.

The general syntax is:
```text
nextflow run nf-core/<pipeline_name> -r <version> -profile test --outdir /shared/team/<output-dir>
```

Where:

* `<pipeline_name>` is the pipeline you want to run
* `<version>` is the revision you want to use (pinning it is important for reproducibility; check the pipeline website for the latest version)
* `<output-dir>` is where Nextflow will save the files. **NOTE** that your home directory will not work!

For example, to test the `rnaseq` pipeline:

```console
nextflow run nf-core/rnaseq -r 3.14.0 -profile test --outdir /shared/team/test-out-rnaseq
```
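
Because an output directory inside the home directory will not work, it can be convenient to check the path before launching. The sketch below is one possible way to do that: `build_test_cmd` is a hypothetical helper (not part of Nextflow or nf-core) that only prints the command, so you can review it before running it yourself.

```bash
#!/usr/bin/env bash
# Sketch: build an nf-core test command and refuse an --outdir inside $HOME.
# build_test_cmd is a hypothetical helper written for this example.

build_test_cmd() {
    local pipeline="$1" version="$2" outdir="$3"

    # Reject the home directory itself and anything below it.
    if [ -n "${HOME:-}" ]; then
        case "$outdir" in
            "$HOME"|"$HOME"/*)
                echo "error: --outdir must not be inside $HOME" >&2
                return 1
                ;;
        esac
    fi

    # Print the command instead of running it, so it can be reviewed first.
    printf 'nextflow run nf-core/%s -r %s -profile test --outdir %s\n' \
        "$pipeline" "$version" "$outdir"
}

build_test_cmd rnaseq 3.14.0 /shared/team/test-out-rnaseq
```

Printing rather than executing keeps the helper safe to experiment with; once the command looks right, copy it into the terminal.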

## An example: fetchngs

`nf-core/fetchngs` is a pipeline to download NGS datasets from public repositories such as the [NCBI Sequence Read Archive](https://www.ncbi.nlm.nih.gov/sra).

We can use it as a first example pipeline as its input is a simple text file with a list of accession codes.

Since Nextflow pipelines cannot access files saved in your home directory, we create the input file in shared storage:

```console
mkdir -p /shared/team/download-lists/
echo -e "ERR12319563\nERR12319484\nERR12319547" > /shared/team/download-lists/test.csv
```

<!-- prettier-ignore -->
!!! Edit the list
    The `echo` command creates a list of three accession numbers from the command line,
    but you can also use the text editor built into the CLIMB notebook to create the file.
    Either way, the file must have the `.csv` extension.
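
If you edit the list by hand, a stray blank line or typo is easy to miss. The following sketch checks a list before it is passed to `--input`. `check_ids` is a hypothetical helper, and its pattern is an assumption that only covers run accessions of the form `ERR...`, `SRR...` or `DRR...`; relax it if your list contains other kinds of identifiers. The demo writes to a temporary directory; on CLIMB you would point it at `/shared/team/download-lists/test.csv`.

```bash
#!/usr/bin/env bash
# Sketch: sanity-check an accession list, one ID per line.
# check_ids and its pattern are assumptions made for this example.

check_ids() {
    local file="$1"
    local pattern='^[EDS]RR[0-9]+$'

    if [ ! -s "$file" ]; then
        echo "error: $file is missing or empty" >&2
        return 1
    fi

    # grep -v selects the lines that do NOT match; any such line is a problem.
    if grep -nvE "$pattern" "$file" >&2; then
        echo "error: the lines above are not run accessions" >&2
        return 1
    fi

    echo "$(grep -c '' "$file") accessions look valid"
}

# Demo in a temporary directory.
list="$(mktemp -d)/test.csv"
printf '%s\n' ERR12319563 ERR12319484 ERR12319547 > "$list"
check_ids "$list"
```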

```bash
# The \ at the end of a line lets you split a command across multiple lines
# If you type the command on a single line, do NOT type the \ characters

nextflow run nf-core/fetchngs -r 1.12.0 \
--input /shared/team/download-lists/test.csv \
--outdir /shared/team/fetchngs-out/
```

Example execution:

<img src="../img/nf-fetch-run.png" alt="nf-core fetchngs execution" height="500">

## S3 buckets

A very handy feature of Nextflow is that it can read from and write to S3 buckets.

To save the output of the nf-core/fetchngs pipeline to a CLIMB S3 bucket
(say, a bucket called "ngs-files"),
simply change the output path to something like:

```bash
# The \ at the end of a line lets you split a command across multiple lines
# If you type the command on a single line, do NOT type the \ characters

nextflow run nf-core/fetchngs -r 1.12.0 \
--input /shared/team/download-lists/test.csv \
--outdir s3://ngs-files/fetchngs-output/
```
5 changes: 3 additions & 2 deletions mkdocs.yml
@@ -50,10 +50,11 @@ nav:
- "Understanding storage": "storage/index.md"
- "Installing software with Conda": "notebook-servers/installing-software-with-conda.md"
- "Using Nextflow": "notebook-servers/using-nextflow.md"
- "Using VS Code": "notebook-servers/using-vscode.md"
- "Using Visual Studio Code": "notebook-servers/using-vscode.md"
- "403 Forbidden Error": "notebook-servers/403-forbidden-error.md"
- "Walkthroughs":
- "Metagenomics": "walkthroughs/metagenomics-tutorial.md"
- "Genome assembly": "walkthroughs/genome-assembly/spades.md"
- "Custom Nextflow Workflows": "walkthroughs/nextflow-custom-workflows/nextflow-custom.md"
- "QIIME 2": "walkthroughs/qiime2.md"
- "nf-core pipelines": "walkthroughs/nfcore.md"
