[BUG] aspera #204

Open
NomiCentarix opened this issue Nov 26, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@NomiCentarix

Following up on my previous issue: I still don't get the fastq files with Aspera, only empty folders, when running the following code:

from pysradb.sraweb import SRAweb
SRA_OUR_DIR = "/data/NCBI_data/"
db = SRAweb()
gse_to_srp = db.gse_to_srp("GSE226189")
print("gse_to_srp shape:", gse_to_srp.shape)
display(gse_to_srp.head(2))

metadata = db.sra_metadata(gse_to_srp["study_accession"].to_list(), detailed=True)
print(metadata.shape)
display(metadata.head(2))

db.download(df=metadata.head(1),
            url_col="ena_fastq_http_1",
            use_ascp=True,
            #threads=8,
            skip_confirmation=True,  # don't ask for permission to download
            out_dir=SRA_OUR_DIR)

OS: AWS EC2, Ubuntu 22.04.2 LTS
anaconda3 Python 3.11.5

When url_col is left at its default, I do get the .sra files.
The link in the "ena_fastq_http_1" column looks fine:
(http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR236/077/SRR23630177/SRR23630177_1.fastq.gz)

NomiCentarix added the bug label Nov 26, 2023
@saketkc
Owner

saketkc commented Dec 6, 2023

Thanks for catching this. I think there is a bug in the download module. For now, I would recommend saving the metadata to a TSV file with pysradb metadata --detailed <SRP> --saveto x.tsv and then using a tool like curl/wget to download the files listed in the *_url column.
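
A minimal sketch of that workaround in Python, for anyone who would rather script it than call curl/wget by hand. It is not part of pysradb. The helper names (collect_urls, download_all), the default column name "ena_fastq_http_1" (taken from the code earlier in this thread), and the set of strings treated as missing values are all assumptions, so adjust them to match the columns in your own x.tsv.

```python
import csv
import os
import shutil
import urllib.request

# Strings treated as "no URL" (assumption: adjust to how your TSV encodes missing data).
MISSING = {"", "NA", "NaN", "nan", "None"}


def collect_urls(tsv_path, url_col="ena_fastq_http_1"):
    """Return the non-missing URLs found in `url_col` of a tab-separated metadata file."""
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if url_col not in (reader.fieldnames or []):
            raise KeyError(f"column {url_col!r} not found in {tsv_path}")
        urls = []
        for row in reader:
            value = (row.get(url_col) or "").strip()
            if value not in MISSING:
                urls.append(value)
        return urls


def download_all(urls, out_dir):
    """Download each URL over plain HTTP/FTP (no Aspera) into `out_dir`."""
    os.makedirs(out_dir, exist_ok=True)
    for url in urls:
        dest = os.path.join(out_dir, os.path.basename(url))
        # Stream the response to disk so large fastq.gz files are not held in memory.
        with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
```

With the metadata saved as above, something like download_all(collect_urls("x.tsv"), "/data/NCBI_data/") would fetch the first-mate files. If collect_urls returns an empty list, the chosen column is probably all NA in the saved file.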

@NomiCentarix
Author

Thanks for the answer. So I should use curl/wget without Aspera, right?

@NomiCentarix
Author

NomiCentarix commented Dec 9, 2023

OK, now I have a strange problem: I don't get the fastq URLs anymore!
All three "ena_fastq_http" columns are NA.
I tested the code in several environments, with no change.
(The data still exists at the same path as before:
http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR236/077/SRR23630177/SRR23630177_1.fastq.gz)

Do you have any idea what happened?
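
While the URL columns come back as NA, one possible stopgap is to rebuild the ENA path from the run accession. The sketch below assumes ENA's documented fastq directory layout (first six characters of the accession, then a zero-padded subdirectory built from the digits beyond the sixth) and the _1/_2 file naming for paired-end runs seen in the URL above. The function names are hypothetical and this is not how pysradb derives its URLs, so check the result against a known-good link before relying on it.

```python
import re

ENA_FASTQ_ROOT = "http://ftp.sra.ebi.ac.uk/vol1/fastq"


def ena_fastq_dir(run_accession):
    """Build the ENA fastq directory URL for a run accession such as SRR23630177."""
    match = re.fullmatch(r"[SED]RR(\d{6,9})", run_accession)
    if match is None:
        raise ValueError(f"unexpected run accession: {run_accession!r}")
    digits = match.group(1)
    prefix = run_accession[:6]   # e.g. "SRR236"
    extra = len(digits) - 6      # number of digits beyond the first six
    if extra == 0:
        # Six-digit accessions have no intermediate subdirectory.
        return f"{ENA_FASTQ_ROOT}/{prefix}/{run_accession}"
    subdir = digits[-extra:].zfill(3)  # e.g. "77" becomes "077"
    return f"{ENA_FASTQ_ROOT}/{prefix}/{subdir}/{run_accession}"


def ena_fastq_url(run_accession, mate=1):
    """Build the URL of one mate file, assuming paired-end naming (<run>_<mate>.fastq.gz)."""
    return f"{ena_fastq_dir(run_accession)}/{run_accession}_{mate}.fastq.gz"
```

For the run discussed here, ena_fastq_url("SRR23630177") reproduces the link quoted above. Single-end runs are normally stored as <run>.fastq.gz without the mate suffix, which this sketch does not cover.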


2 participants