Slow download speed v214 #522

Open
kafker opened this issue Jun 2, 2023 · 14 comments

Comments

@kafker

kafker commented Jun 2, 2023

GitHub issues are specifically for issues with the GTDB-Tk, please join us on the GTDB forum:

Dear devs,

not sure if this was caused by a network problem on my side or your side. I am trying to download the latest GTDB database:

wget https://data.gtdb.ecogenomic.org/releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz

However, after 10 min or so the download drops to 100k per sec, making it impossible to download the database in a reasonable amount of time.
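One way to make a stalling download less painful is to resume it instead of starting over. The sketch below is not part of GTDB-Tk. It assumes the server honours HTTP Range requests, which nobody in this thread has confirmed, and `range_header` and `resume_download` are names made up for illustration. On the command line, `wget -c` with the same URL does the equivalent.

```python
import os
import urllib.error
import urllib.request


def range_header(offset: int) -> dict:
    """Return the HTTP header asking the server to start at byte `offset`."""
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def resume_download(url: str, dest: str, chunk_size: int = 1 << 20) -> None:
    """Continue downloading `url` into `dest` from the bytes already on disk."""
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url, headers=range_header(offset))
    try:
        response = urllib.request.urlopen(request, timeout=60)
    except urllib.error.HTTPError as err:
        if err.code == 416:  # offset is at or past the end of the file
            return
        raise
    with response:
        # 206 means the Range header was honoured (assumption: the server
        # supports it); 200 means the whole file is being sent again, so
        # overwrite instead of appending.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest, mode) as out:
            while chunk := response.read(chunk_size):
                out.write(chunk)


# Example call (performs a real download, so it is left commented out):
# resume_download(
#     "https://data.gtdb.ecogenomic.org/releases/release214/214.0/"
#     "auxillary_files/gtdbtk_r214_data.tar.gz",
#     "gtdbtk_r214_data.tar.gz",
# )
```

Resuming does not make the transfer faster; it only keeps the bytes already received after a drop or a manual restart.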

I tried different wireless connections (HPC or home) but nothing seems to work.

Thank you!
K

@konstantin-demin

Same issue here. I have relatively high speed internet at home. Still, the download rate of GTDB db barely exceeds 200 kb/s.

@aaronmussig
Member

Hello,

Thank you for raising this issue, I'll take a look into this.

Can you both let me know what speed you get when trying to download from the mirror? https://data.ace.uq.edu.au/public/gtdb/data/releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz
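To report comparable numbers for the primary server and the mirror, a short timed sample from each URL is enough. This is only a rough sketch: `mb_per_second` and `sample_speed` are hypothetical helper names, and the 50 MB sample size is an arbitrary choice.

```python
import time
import urllib.request


def mb_per_second(num_bytes: int, seconds: float) -> float:
    """Convert a byte count and elapsed time to megabytes (10**6 bytes) per second."""
    return num_bytes / seconds / 1e6


def sample_speed(url: str, sample_bytes: int = 50_000_000) -> float:
    """Read up to `sample_bytes` from `url`, discard them, and return MB/s."""
    received = 0
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=60) as response:
        while received < sample_bytes:
            chunk = response.read(min(1 << 20, sample_bytes - received))
            if not chunk:  # server closed the connection early
                break
            received += len(chunk)
    return mb_per_second(received, time.monotonic() - start)


# Example calls (perform real downloads, so they are left commented out):
# base = "releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz"
# print(sample_speed("https://data.gtdb.ecogenomic.org/" + base))
# print(sample_speed("https://data.ace.uq.edu.au/public/gtdb/data/" + base))
```

Since the slowdown reported above only sets in after about 10 minutes, a short sample can overestimate the sustained rate; a larger `sample_bytes` gives a more honest figure.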

I'm also in the process of applying for an additional quota to use Zenodo as a secondary mirror.

Cheers,
Aaron

@kafker
Author

kafker commented Jun 4, 2023

> Can you both let me know what speed you get when trying to download from the mirror? https://data.ace.uq.edu.au/public/gtdb/data/releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz

Hi Aaron,

The download from the mirror is much more stable.

The download speed was 4-7 MB/s

Thank you!
K

@konstantin-demin

> Can you both let me know what speed you get when trying to download from the mirror? https://data.ace.uq.edu.au/public/gtdb/data/releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz

Hello Aaron. The link you provided reaches the same speed as before, ~200 kb/s. But I managed to download the db by switching to Windows and downloading it directly from the latest link in the list of releases here: https://ecogenomics.github.io/GTDBTk/installing/index.html. From Windows, the speed was 4-10 mb/s. I don't really know whether the problem is in the automatic download or in my Linux machine (I have no problems with any other downloads, though).

I think an additional mirror wouldn't be bad.

Thanks for the help!

@Sumsarium

> Hello,
>
> Thank you for raising this issue, I'll take a look into this.
>
> Can you both let me know what speed you get when trying to download from the mirror? https://data.ace.uq.edu.au/public/gtdb/data/releases/release214/214.0/auxillary_files/gtdbtk_r214_data.tar.gz
>
> I'm also in the process of applying for an additional quota to use Zenodo as a secondary mirror.
>
> Cheers, Aaron

I get 0.5-5 mb/s using the above link. That's about 10-20x faster than normal...

@aaronmussig
Member

Sorry to hear about the slow speeds, I am still waiting on Zenodo to get back to me about additional storage.

In the meantime, I've developed a small program that will download the GTDB-Tk R214 reference database from the unarchived data. It's fault tolerant and will allow you to download with multiple threads.
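As a rough illustration of what "fault tolerant" and "multiple threads" can mean for a set of separate files, here is a minimal sketch. It is not the code of the program mentioned above: `with_retries`, `fetch_all`, and the caller-supplied `fetch` callback are all hypothetical, and results are held in memory only to keep the example short.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def with_retries(fetch: Callable[[str], bytes], url: str,
                 attempts: int = 5, wait_seconds: float = 2.0) -> bytes:
    """Call `fetch(url)`, retrying on network errors up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError:  # urllib.error.URLError and socket timeouts are OSErrors
            if attempt == attempts:
                raise
            time.sleep(wait_seconds * attempt)  # wait longer after each failure
    raise AssertionError("unreachable")


def fetch_all(fetch: Callable[[str], bytes], urls: Iterable[str],
              threads: int = 4, attempts: int = 5,
              wait_seconds: float = 2.0) -> Dict[str, bytes]:
    """Fetch every URL on a thread pool, retrying each one independently."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = pool.map(
            lambda u: with_retries(fetch, u, attempts, wait_seconds), url_list
        )
        return dict(zip(url_list, results))
```

A real downloader would stream each file to disk rather than return bytes, but the structure is the same: one failed file is retried on its own, so it does not restart the whole transfer.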

If anyone who is experiencing slow download speeds would like to give it a go, please see: https://github.com/Ecogenomics/gtdbtk-db-download

I've got a few more involved ideas for speeding it up, namely downloading the FASTA files from NCBI, but I'll only do that if this is still unusable.

@ValentinCledassou

ValentinCledassou commented Aug 25, 2023

It is impossible to download the R214 database; it's too slow (20kb/sec). The same goes for the mirror.

@aaronmussig
Member

I tested the download from Denmark and Australia, and the speed was ~7MB/s in both cases. Nevertheless, I rebooted NGINX. Did it help?

@ValentinCledassou

With a VPN to Australia, I get the same speed as you. But without a VPN (in France) it's always ~20kb/sec.

@Sumsarium

Mine starts at 8 mb/s but quickly drops down to around 300-500 kb/s. Generally seems to be a bit unstable wrt speed. I haven't tested it via VPN. Not a big issue (for me at least) as long as the databases aren't updated on a weekly basis...

@bheimbu

bheimbu commented Apr 5, 2024

Hi @aaronmussig,

any news on this? The download speed from Germany is super slow, like 200 kb/s.

Cheers, Bastian

@iwilkie

iwilkie commented May 13, 2024

Hi,

Is there a solution for this issue when downloading r220? I've noticed that my download of the new release oscillates between 10 - 60 KB/s, and our IT department confirmed that it's not an issue from our side.

Thanks!

@Sumsarium

This seems to be a persistent issue. It still takes me several days to download the databases (Denmark).
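For a sense of scale, the arithmetic behind "several days" can be sketched as follows. The 100 GB archive size is a round placeholder, not the real size of the database, and the rates quoted in this thread are assumed to be kilobytes (not kilobits) per second.

```python
def download_days(size_gb: float, rate_kb_per_s: float) -> float:
    """Days needed to transfer `size_gb` gigabytes at `rate_kb_per_s` kB/s (decimal units)."""
    seconds = (size_gb * 1e9) / (rate_kb_per_s * 1e3)
    return seconds / 86_400  # seconds per day


PLACEHOLDER_SIZE_GB = 100  # hypothetical size, for illustration only

print(round(download_days(PLACEHOLDER_SIZE_GB, 20), 1))    # 57.9 days at 20 kB/s
print(round(download_days(PLACEHOLDER_SIZE_GB, 200), 1))   # 5.8 days at 200 kB/s
print(round(download_days(PLACEHOLDER_SIZE_GB, 5000), 1))  # 0.2 days at 5 MB/s
```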

@iwilkie

iwilkie commented May 13, 2024

> It still takes me several days to download the databases (Denmark)

I'm in Germany and my download has been going for 10 days now... Have you tried using the VPN to Australia? Unfortunately I cannot test this from my work setup, but I wanted to give it a try when I get back to my personal computer.

7 participants