Download speed reduces with large files (> ~100GB) #693

Open
himanshu-interra opened this issue Jan 29, 2024 · 1 comment

@himanshu-interra

Using s5cmd v2.2.1-be63977, I am noticing good throughput at the beginning of the download (~900 MB/s), but it gradually drops to ~300 MB/s and stays at that speed.

Command used:
s5cmd.exe cp --sp --concurrency 8 "s3 file path" "local path"

If concurrency is not passed as a parameter, the download speed remains constant throughout the download (~280 MB/s). All of this is being done on an EC2 instance with the following configuration:

  • Instance Type: C6a.12xlarge
  • Volume Type: gp3
  • IOPS: 4000
  • Throughput: 1000MB/s
  • OS: Windows Server 2022

Am I using concurrency wrong? Or is there a bug in its implementation?

@kucukaslan
Contributor

Technically, you are not using it wrong, and there is no bug (in the sense we usually mean).

According to the AWS docs, gp3 volumes use a 64 KiB I/O size. With concurrent downloads we are doing random writes, not sequential writes, so the actual download speed is limited by IOPS × I/O size, i.e. 4000 × 64 KiB, roughly 250 MiB/s (~262 MB/s).
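A quick back-of-the-envelope check of that ceiling (a sketch only; the 64 KiB I/O size and the 4000 IOPS come from the AWS docs and the volume configuration above):

```python
# Rough write-throughput ceiling for random writes on this gp3 volume,
# using the 64 KiB I/O size from the AWS gp3 documentation.
iops = 4_000
io_size = 64 * 1024             # bytes per random write
ceiling = iops * io_size        # bytes per second

print(f"{ceiling / 10**6:.0f} MB/s")   # ~262 MB/s
print(f"{ceiling / 2**20:.0f} MiB/s")  # ~250 MiB/s, close to the observed ~280-300 MB/s
```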

The aforementioned AWS docs state the gp3 IOPS limit as 16,000. If possible, increasing the provisioned IOPS to 16,000 would raise the write throughput ceiling up to roughly 1000 MB/s.
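For illustration only (this is not something s5cmd does for you, and the volume ID below is a placeholder), the provisioned IOPS of an existing gp3 volume can be raised with boto3; the same change can also be made from the console or with `aws ec2 modify-volume`:

```python
# Sketch: raise the gp3 volume's provisioned IOPS so that 16,000 random
# 64 KiB writes per second can saturate the volume's 1000 MB/s throughput setting.
# VolumeId is a placeholder, not a value taken from this issue.
import boto3

ec2 = boto3.client("ec2")
ec2.modify_volume(VolumeId="vol-0123456789abcdef0", Iops=16000)
```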

The same problem (situation?) was also mentioned in #418 and #667.
At some point I naively attempted to speed this up by sequentializing the writes, but I couldn't make it work. The physical disk backing an EBS volume is probably shared with other customers (and a single EBS volume may be spread across multiple physical disks), so it does not seem possible to force sequential writes.
