Limit size of archives put to HSI #2

Open
3 of 4 tasks
kc9jud opened this issue Oct 4, 2020 · 1 comment

kc9jud commented Oct 4, 2020

According to the NERSC HPSS documentation (https://docs.nersc.gov/filesystems/archive/#avoid-very-large-files), files larger than 2 TB are handled inefficiently when put to HPSS. NERSC recommends splitting any file that exceeds that limit into 500 GB chunks.

The hsi handler mcscript.task.archive_handler_hsi() should:

  • Inspect the size of an archive file to put.
  • Put the file directly using hsi if it is smaller than the threshold, or
  • Use split to break the file into smaller segments and put those with hsi.
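
The steps above could be sketched roughly as follows. This is not the code in mcscript; the function names `plan_put_commands` and `put_archive`, the `.part` segment prefix, the three-digit numeric suffixes, and the choice of the archive's basename as the HPSS destination are all assumptions made for illustration. The threshold and segment size are taken from the NERSC guidance quoted above. The `-d` option assumes a `split` that supports numeric suffixes (e.g. GNU coreutils).

```python
import os
import subprocess

# Assumed values based on the NERSC guidance: split above 2 TB into 500 GB segments.
SPLIT_THRESHOLD = 2 * 10**12
SEGMENT_SIZE = 500 * 10**9


def plan_put_commands(archive_path, archive_size,
                      threshold=SPLIT_THRESHOLD, segment_size=SEGMENT_SIZE):
    """Return the commands (as argument lists) needed to put one archive.

    Hypothetical helper: small archives get a single hsi put; large ones
    get one split command followed by one hsi put per segment.
    """
    name = os.path.basename(archive_path)
    if archive_size < threshold:
        # Small enough: put the file directly.
        return [["hsi", f"put {archive_path} : {name}"]]

    # Too large: split into fixed-size segments with 3-digit numeric suffixes.
    prefix = archive_path + ".part"
    num_segments = -(-archive_size // segment_size)  # ceiling division
    commands = [
        ["split", "-b", str(segment_size), "-a", "3", "-d", archive_path, prefix]
    ]
    for i in range(num_segments):
        segment = f"{prefix}{i:03d}"
        commands.append(["hsi", f"put {segment} : {os.path.basename(segment)}"])
    return commands


def put_archive(archive_path):
    """Inspect the archive size, then run the planned commands in order."""
    size = os.path.getsize(archive_path)
    for command in plan_put_commands(archive_path, size):
        subprocess.run(command, check=True)
```

Separating the planning step from execution keeps the size check and segment naming easy to test without access to hsi.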

In addition (so that consumers of archives don't need to be aware of this splitting behavior):

  • mcscript should also provide a wrapper function to fetch and reassemble archives.
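
A minimal sketch of the reassembly half of such a wrapper is shown here, assuming the segments have already been fetched (e.g. with hsi get) into the same directory as the target archive. The function name `reassemble_archive` and the `.part` plus fixed-width numeric suffix naming are hypothetical and are not taken from mcscript.

```python
import glob
import shutil


def reassemble_archive(archive_path, part_suffix=".part"):
    """Concatenate fetched segments back into a single archive file.

    Assumes segments are named archive_path + part_suffix + NNN with
    fixed-width numeric suffixes, so lexicographic order is segment order.
    """
    pattern = glob.escape(archive_path + part_suffix) + "*"
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no segments found for {archive_path}")

    # Stream each segment into the output so large files never sit in memory.
    with open(archive_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as segment:
                shutil.copyfileobj(segment, out)
    return parts
```

A caller would then see only the reassembled archive and would not need to know whether it had been split when it was put.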

kc9jud commented Oct 9, 2020

Commit c849af4 implements the logic in archive_handler_hsi().
