Inefficient Write+Copy+Delete pattern when writing to S3. #1949

Open
apghml opened this issue Feb 27, 2024 · 1 comment

apghml commented Feb 27, 2024

When tf.io writes a file, it checks whether HasAtomicMove is true and if so, simulates atomic writes by first writing to a temporary file, then renaming the file to the correct name. This is great on a local filesystem. But for S3, this is undesirable behavior for a few reasons:

  1. S3 writes are already atomic, so there is no need to simulate one.
  2. In the AWS S3 SDK, which tf.io uses, moving a file is implemented as a copy+delete, which increases the load on S3 compared to a direct write.
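To make the extra load concrete, here is a minimal sketch that counts the requests each write strategy would issue. It uses a toy in-memory store, not tf.io or the AWS SDK; `CountingObjectStore`, the `.tmp` suffix, and the key name are all hypothetical and chosen only for illustration.

```python
class CountingObjectStore:
    """Toy in-memory stand-in for an object store (hypothetical, not the AWS SDK)."""

    def __init__(self):
        self.objects = {}   # key -> bytes
        self.requests = []  # log of every request issued

    def put(self, key, data):
        self.requests.append(("PUT", key))
        self.objects[key] = data

    def copy(self, src, dst):
        self.requests.append(("COPY", src, dst))
        self.objects[dst] = self.objects[src]

    def delete(self, key):
        self.requests.append(("DELETE", key))
        del self.objects[key]


def write_via_temp_and_rename(store, key, data):
    """Simulated atomic write: write a temp object, then 'rename' it."""
    tmp_key = key + ".tmp"  # assumed temp-name scheme, for illustration only
    store.put(tmp_key, data)
    # On an object store a rename is not a metadata operation:
    # it becomes a copy followed by a delete.
    store.copy(tmp_key, key)
    store.delete(tmp_key)


def write_direct(store, key, data):
    """Single write straight to the final key."""
    store.put(key, data)


def count_requests(write_fn):
    """Run one write against a fresh store and return the request count."""
    store = CountingObjectStore()
    write_fn(store, "checkpoints/model.ckpt", b"weights")
    return len(store.requests)


if __name__ == "__main__":
    print("temp + rename:", count_requests(write_via_temp_and_rename))
    print("direct write: ", count_requests(write_direct))
```

In this model the temp-and-rename path issues three requests per file and the direct path issues one. Both leave the same object under the final key.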
@learning-to-play

Hi @yongtang, this issue is preventing some users from running their jobs on GPU because of the additional load it causes. Could you help?
