Adding cellrangermulti subworkflow #276

fmalmeida · 2023-11-23T11:39:50Z

Close #247
Close #313

PR checklist

Context

Hi guys,

This is not finished yet: the parameter schema, defaults, and documentation still need updating. I am opening the PR now so that we can all review it, discuss any changes required before merging, give it a round of tests, and decide how some of the parameters should behave.

I used the templates provided by @klkeys

Usage context

samplesheet
To use it, the samplesheet requires an additional column, feature_type, so that the different feature types given for each sample can be combined correctly.

sample,fastq_1,fastq_2,feature_type,protocol,expected_cells
PBMC_10K,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5gex/subsampled_sc5p_v2_hs_PBMC_10k_5gex_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5gex/subsampled_sc5p_v2_hs_PBMC_10k_5gex_S1_L001_R2_001.fastq.gz,gex,SC5P-PE,1000
PBMC_10K,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/bcell/subsampled_sc5p_v2_hs_PBMC_10k_b_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/bcell/subsampled_sc5p_v2_hs_PBMC_10k_b_S1_L001_R2_001.fastq.gz,vdj,SC5P-PE,1000
PBMC_10K,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5fb/subsampled_sc5p_v2_hs_PBMC_10k_5fb_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc/fastqs/5gex/5fb/subsampled_sc5p_v2_hs_PBMC_10k_5fb_S1_L001_R2_001.fastq.gz,ab,SC5P-PE,1000
PBMC_10K_CMO,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/gex_1/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_gex_S2_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/gex_1/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_gex_S2_L001_R2_001.fastq.gz,gex,SC3Pv3,1000
PBMC_10K_CMO,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/cmo/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_multiplexing_capture_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/10k_pbmc_cmo/fastqs/cmo/subsampled_SC3_v3_NextGem_DI_CellPlex_Human_PBMC_10K_1_multiplexing_capture_S1_L001_R2_001.fastq.gz,cmo,SC3Pv3,1000
PBMC_10K_CMV,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/gex_1/subsampled_5k_human_antiCMV_T_TBNK_connect_GEX_1_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/gex_1/subsampled_5k_human_antiCMV_T_TBNK_connect_GEX_1_S1_L001_R2_001.fastq.gz,gex,SC5P-R2,1000
PBMC_10K_CMV,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/ab/subsampled_5k_human_antiCMV_T_TBNK_connect_AB_S2_L004_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/ab/subsampled_5k_human_antiCMV_T_TBNK_connect_AB_S2_L004_R2_001.fastq.gz,ab,SC5P-R2,1000
PBMC_10K_CMV,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/vdj/subsampled_5k_human_antiCMV_T_TBNK_connect_VDJ_S1_L001_R1_001.fastq.gz,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/10xgenomics/cellranger/5k_cmvpos_tcells/fastqs/vdj/subsampled_5k_human_antiCMV_T_TBNK_connect_VDJ_S1_L001_R2_001.fastq.gz,vdj,SC5P-R2,1000
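As a rough illustration of how the feature_type column lets rows be combined per sample, here is a minimal Python sketch. It is not the pipeline's actual parsing code (which lives in the Nextflow workflow); the function name group_by_sample and the shortened FASTQ file names are assumptions made up for this example.

```python
import csv
import io
from collections import defaultdict

# Shortened, hypothetical file names stand in for the full test-dataset URLs.
SAMPLESHEET = """\
sample,fastq_1,fastq_2,feature_type,protocol,expected_cells
PBMC_10K,gex_R1.fastq.gz,gex_R2.fastq.gz,gex,SC5P-PE,1000
PBMC_10K,vdj_R1.fastq.gz,vdj_R2.fastq.gz,vdj,SC5P-PE,1000
PBMC_10K,ab_R1.fastq.gz,ab_R2.fastq.gz,ab,SC5P-PE,1000
PBMC_10K_CMO,cmo_gex_R1.fastq.gz,cmo_gex_R2.fastq.gz,gex,SC3Pv3,1000
PBMC_10K_CMO,cmo_R1.fastq.gz,cmo_R2.fastq.gz,cmo,SC3Pv3,1000
"""


def group_by_sample(text):
    """Return {sample: {feature_type: [(fastq_1, fastq_2), ...]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(text)):
        pair = (row["fastq_1"], row["fastq_2"])
        grouped[row["sample"]][row["feature_type"]].append(pair)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(features) for sample, features in grouped.items()}


groups = group_by_sample(SAMPLESHEET)
for sample, features in groups.items():
    print(sample, sorted(features))
```

Each sample ends up with one entry per feature type, which is the shape a per-sample cellranger multi run would need.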

Supporting files

Right now, all the supporting files have been added as parameters, for example cmo_barcode_csv, beam_antigen_csv, etc. This means they apply to the whole dataset: the same file is used for every sample in the samplesheet. The alternative would be to add them as samplesheet columns, which would allow a different file per sample.

My main question here is: which approach do we want?
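To make the two options concrete, here is a small Python sketch of one possible middle ground: use a per-sample value when the samplesheet provides one, and fall back to the dataset-wide parameter otherwise. This is only an illustration of the trade-off, not something implemented in this PR; the helper resolve_supporting_file, the idea of a cmo_barcode_csv samplesheet column, and the file paths are all hypothetical.

```python
def resolve_supporting_file(row, params, key):
    """Prefer a per-sample value from the samplesheet row; otherwise use the dataset-wide parameter."""
    value = (row.get(key) or "").strip()
    return value or params.get(key)


# Dataset-wide parameter, as currently implemented (path is hypothetical).
params = {"cmo_barcode_csv": "shared_cmo_barcodes.csv"}

# Hypothetical samplesheet rows with an optional per-sample column.
rows = [
    {"sample": "PBMC_10K_CMO", "cmo_barcode_csv": "per_sample_cmo.csv"},
    {"sample": "PBMC_10K", "cmo_barcode_csv": ""},
]

resolved = {
    row["sample"]: resolve_supporting_file(row, params, "cmo_barcode_csv")
    for row in rows
}
print(resolved)
```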

Other stuff
Of course, there may still be things to clean up or finish that I have overlooked, since the PR is quite large, so I would appreciate your help spotting them.

The test case required the full genomes from Ensembl; otherwise, the VDJ analysis was failing.

grst · 2024-05-15T09:48:41Z

Ok. I think I'm done with the documentation.

grst · 2024-05-17T11:51:14Z

I don't think lint will pass before the template update is merged. However, since this is such a massive PR, I'd do the template update after this one gets merged to dev, to avoid further disruptions.

grst

Are we there then? I don't think there's anything pending for now.

fmalmeida · 2024-05-17T12:15:41Z

> Are we there then? I don't think there's anything pending for now.

Hi @grst,

I think so. The last open items were the documentation and the points you had raised.
There are a few TODOs, such as the BEAM data, but they can be addressed later.

This was the comment summarising it: #276 (comment)

We can take a last look at that and open follow-ups; otherwise, that's it.

fmalmeida · 2024-05-23T07:42:26Z

I have added a follow-up ticket for the open points: #332

fmalmeida and others added 29 commits November 7, 2023 11:31

Add cellranger multi testing assets

94397e5

allow cellrangermulti option

e6a54bb

include cellrangermulti testing conf/profile

ae20b9e

allow cellrangermulti option

5863d2c

fix example samplesheet

631a980

fixed samplesheet for cellranger multi

70fff60

don't get cellrangemulti metadata if not needed

f97e430

fix check_samplesheet script to be more generic

ff66e97

update input_check for cellranger multi

0af2761

avoid renaming sample ids in input check

8e7d436

generate a parsed input channel for cellrangermulti sub-workflow

97281aa

defined cellrangemulti sub-workflow and parsed input channel for execution

8b19f48

included gex (normal) reference building and updated cellranger modules

6f494fe

include mkvdjref

215226b

refactored sample mapping

7b86f80

finally cellranger multi running, with errors, but now can be debugged

e6dc3b2

not finding samples in data directory

7c66115

saving quick changes for shifting development workspace

abb0e3c

include option for unzipping reference files

5e27a4c

First successfull run of cellranger multi with renaming module

e00a78d

add whiteline

ead6462

Testing github traffic

ab4425e

Remove file used for testing

2cfb148

input dataset parsing refactored and fixed

4c275b4

include cellrangermulti outputs in mqc channel

a8d2702

include option for cellrangermulti in mtx conversion modules

0d8be69

add files filter for cellranger multi outputs

4d75c83

include cellranger multi outputs to mtx conversion subworkflow

e4e37b5

update changelog

132e247

fmalmeida requested a review from apeltzer November 23, 2023 11:39

zxBIB Almeida,Felipe (GCBDS) EXTERNAL and others added 7 commits May 10, 2024 10:01

correct indentation

9a3e529

starting documentation on cellranger multi

f842cba

continue documentation

adfda0f

update documentation

0fce1c8

add section in outputs

dc63ae3

Merge remote-tracking branch 'origin/dev' into 247-support-for-10x-ffpe-scrna

9bff0b5

Update usage

92502c9

grst reviewed May 15, 2024

docs/usage.md (resolved)

grst added 3 commits May 15, 2024 11:29

Fixed 'file-path' in nextflow schema

c22ad6f

Update output documentation

c95b11c

Update nextflow_schema.json

f4304ad

grst reviewed May 15, 2024

subworkflows/local/align_cellrangermulti.nf (resolved)

fmalmeida commented May 16, 2024

docs/usage.md (outdated, resolved)

fmalmeida added 2 commits May 16, 2024 08:26

remove gex_barcode_sample_assignment parameter

03a38cd

add note

e22f986

grst added 3 commits May 17, 2024 14:00

Update nextflow schema documentation

0d0275e

Revert "remove gex_barcode_sample_assignment parameter"

882812c

This reverts commit 03a38cd.

Revert "add note"

708c903

This reverts commit e22f986.

grst approved these changes May 17, 2024

apeltzer approved these changes May 22, 2024

maxulysse added 3 commits May 22, 2024 12:51

Merge branch 'dev' into 247-support-for-10x-ffpe-scrna

b3afdb7

update file

d497ca8

update file better

4d9f17e

maxulysse merged commit 7286aa6 into dev May 22, 2024
13 checks passed

maxulysse deleted the 247-support-for-10x-ffpe-scrna branch May 22, 2024 17:35

fmalmeida mentioned this pull request May 23, 2024

cellranger-multi implementation follow-up #332

Open