pancan_pcawg_2020 and pan_origimed_2020 #62

Open
mjsteinbaugh opened this issue Sep 15, 2022 · 4 comments

@mjsteinbaugh

Hi Waldron Lab,

I'm working on migrating my cBioPortal workflow code to use cBioPortalData, and the package is really excellent. Great work. One thing that I've noticed is that pancan_pcawg_2020 doesn't appear to be supported by the main cBioPortalData() or cBioDataPack() functions.

I checked using this code:

api <- cBioPortalData:::.loadReportData()[["api_build"]]
pack <- cBioPortalData:::.loadReportData()[["pack_build"]]
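The same build flags also appear to be reachable through the exported API, without `:::`. The following is a minimal sketch, assuming `getStudies()` accepts `buildReport = TRUE` and then returns `api_build` and `pack_build` columns alongside `studyId` (check `?getStudies` for the installed version). `check_builds()` is a hypothetical helper name, not part of the package.

```r
# Hypothetical helper: report the build status for one or more study IDs.
# The default `studies` argument is evaluated lazily, so the portal is only
# queried when no table is supplied.
# Assumption: getStudies(..., buildReport = TRUE) adds the `api_build` and
# `pack_build` columns.
check_builds <- function(
    ids,
    studies = cBioPortalData::getStudies(
        cBioPortalData::cBioPortal(), buildReport = TRUE
    )
) {
    # Keep only the columns that are actually present in the table.
    cols <- intersect(c("studyId", "api_build", "pack_build"), names(studies))
    studies[studies$studyId %in% ids, cols, drop = FALSE]
}
```

Calling `check_builds(c("pancan_pcawg_2020", "pan_origimed_2020"))` would query the portal and return one row per study found.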

See related dataset:
https://www.cbioportal.org/study/summary?id=pancan_pcawg_2020

I'm happy to help add support for this dataset if you can walk me through it. One other question: could the package offer downloads of pre-processed MultiAssayExperiment objects instead of relying on the pack file approach? Is that doable?

Best,
Mike

@mjsteinbaugh
Author

Also, the pan_origimed_2020 dataset would be a really helpful addition.

https://www.cbioportal.org/study/summary?id=pan_origimed_2020

@mjsteinbaugh
Author

Found a minor bug with the BiocFileCache call: cBioDataPack() with ask = FALSE still prompts the user to create the cBioPortalData BiocFileCache directory if it doesn't exist.

@mjsteinbaugh changed the title from "pancan_pcawg_2020" to "pancan_pcawg_2020 and pan_origimed_2020" on Sep 15, 2022

@LiNk-NY
Contributor

LiNk-NY commented Sep 15, 2022

Hi Michael (@mjsteinbaugh),
I've tested so far with pancan_pcawg_2020 and it looks like only the mutation data can be represented.

cBioDataPack("pancan_pcawg_2020", check_build = FALSE)
Study file in cache: pancan_pcawg_2020
Working on: /tmp/RtmpO7UP1B/bb3743edba3_pancan_pcawg_2020/pancan_pcawg_2020/data_cna.txt
Working on: /tmp/RtmpO7UP1B/bb3743edba3_pancan_pcawg_2020/pancan_pcawg_2020/data_mirna_zscores.txt
Working on: /tmp/RtmpO7UP1B/bb3743edba3_pancan_pcawg_2020/pancan_pcawg_2020/data_mirna.txt
Working on: /tmp/RtmpO7UP1B/bb3743edba3_pancan_pcawg_2020/pancan_pcawg_2020/data_mrna_seq_fpkm_zscores_ref_all_samples.txt
Working on: /tmp/RtmpO7UP1B/bb3743edba3_pancan_pcawg_2020/pancan_pcawg_2020/data_mrna_seq_fpkm.txt
Working on: /tmp/RtmpO7UP1B/bb3743edba3_pancan_pcawg_2020/pancan_pcawg_2020/data_mutations.txt
Working on: /tmp/RtmpO7UP1B/bb3743edba3_pancan_pcawg_2020/pancan_pcawg_2020/data_timeline_status.txt
harmonizing input:
  removing 18 colData rownames not in sampleMap 'primary'
A MultiAssayExperiment object of 1 listed
 experiment with a user-defined name and respective class.
 Containing an ExperimentList class object of length 1:
 [1] mutations: RaggedExperiment with 382937 rows and 2683 columns
Functionality:
 experiments() - obtain the ExperimentList instance
 colData() - the primary/phenotype DataFrame
 sampleMap() - the sample coordination DataFrame
 `$`, `[`, `[[` - extract colData columns, subset, or experiment
 *Format() - convert into a long or wide DataFrame
 assays() - convert ExperimentList to a SimpleList of matrices
 exportClass() - save data to flat files

You can download the data manually using downloadStudy() and take a look at the contents.
I am not sure why the CNA and other datasets are not being built into the MultiAssayExperiment. I will take a closer look later.

Use version 2.9.11 or greater.
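
The manual route could look roughly like this. It is a sketch, assuming the exported helpers `downloadStudy()` and `untarStudy()` behave as in the package vignette (the first returns the path to the cached tarball, the second returns the directory it was extracted to). `list_data_files()` and `inspect_study()` are hypothetical names used only for illustration.

```r
# Hypothetical helper: list the data_*.txt files found under an extracted
# study directory (paths are relative to `study_dir`).
list_data_files <- function(study_dir) {
    sort(list.files(study_dir, pattern = "^data_.*\\.txt$", recursive = TRUE))
}

# Hypothetical wrapper: download a study tarball, extract it, and list the
# data files it contains. Requires network access and cBioPortalData.
inspect_study <- function(study_id, exdir = tempdir()) {
    tarball <- cBioPortalData::downloadStudy(study_id)
    study_dir <- cBioPortalData::untarStudy(tarball, exdir = exdir)
    list_data_files(study_dir)
}
```

Running `inspect_study("pancan_pcawg_2020")` should show which files (for example `data_cna.txt`) are present in the pack before `loadStudy()` tries to build them.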

Best,
Marcel

@mjsteinbaugh
Author

Thanks @LiNk-NY, I'll take a look and get back to you. I'm primarily interested in the CNA data for both datasets, which I can query via the API but not through the package's main recommended functions.
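
For context, an API-based CNA query of the kind described here might be sketched as below. It assumes that `molecularProfiles()` returns `molecularProfileId` and `molecularAlterationType` columns, that CNA profiles carry the type `COPY_NUMBER_ALTERATION`, and that the study has a sample list named `<studyId>_all` (worth confirming with `sampleLists()`). `cna_profiles()` and `fetch_cna()` are hypothetical helper names.

```r
# Hypothetical helper: keep only copy-number profiles from a profiles table.
cna_profiles <- function(profiles) {
    is_cna <- profiles$molecularAlterationType == "COPY_NUMBER_ALTERATION"
    profiles[is_cna, , drop = FALSE]
}

# Hypothetical wrapper: fetch CNA data for a set of Hugo symbols via the API.
# Assumption: the "<studyId>_all" sample list exists for the study.
fetch_cna <- function(
    study_id,
    genes,
    cbio = cBioPortalData::cBioPortal(),
    sample_list_id = paste0(study_id, "_all")
) {
    profiles <- cBioPortalData::molecularProfiles(cbio, studyId = study_id)
    ids <- cna_profiles(profiles)$molecularProfileId
    cBioPortalData::getDataByGenes(
        cbio,
        studyId = study_id,
        genes = genes,
        by = "hugoGeneSymbol",
        molecularProfileIds = ids,
        sampleListId = sample_list_id
    )
}
```

For example, `fetch_cna("pancan_pcawg_2020", c("TP53", "MYC"))` would request CNA values for two genes across the study's samples.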
