nsidc_icesat2_sync skips files it should download #33

Open
mrsiegfried opened this issue Jul 5, 2021 · 1 comment

Comments

@mrsiegfried
Contributor

Hi Tyler,

It looks like the test nsidc_icesat2_sync uses to decide whether a remote file at NSIDC should be downloaded isn't sufficient. The current test in the http_pull_file function (ignoring the clobber flag) only compares modification times: if the local file is newer, it is left alone. If the script breaks mid-download, before the os.utime line can reset the local file's modification time to that of the remote file, the (corrupt) local file keeps the modification time of when the script broke. It then looks newer than the remote file, so it is not replaced when nsidc_icesat2_sync is re-run.
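The failure mode described above can be reproduced with a minimal sketch. This is not the code in http_pull_file: `needs_download_mtime`, `simulate_interrupted_download`, the file name, and the remote timestamp are all hypothetical stand-ins for a modification-time-only test.

```python
import os
import tempfile


def needs_download_mtime(local_path, remote_mtime):
    """Hypothetical mtime-only test: fetch if the local file is missing or older."""
    if not os.path.exists(local_path):
        return True
    return os.path.getmtime(local_path) < remote_mtime


def simulate_interrupted_download():
    """Return True if a truncated local file would be skipped on the next run."""
    remote_mtime = 1_600_000_000.0  # made-up remote timestamp (September 2020)
    with tempfile.TemporaryDirectory() as directory:
        local_path = os.path.join(directory, "granule.h5")
        # The download breaks after writing only part of the file...
        with open(local_path, "wb") as f:
            f.write(b"partial")
        # ...so os.utime is never called and the mtime stays at "now",
        # which is newer than the remote timestamp.
        return not needs_download_mtime(local_path, remote_mtime)
```

Calling `simulate_interrupted_download()` shows the truncated file passing the test, because its modification time reflects when the script broke rather than when the remote file was last changed.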

I didn't delve into the XML file that is parsed in nsidc_list, so I don't know what other parameters are available for the test in http_pull_file, but replacing the modification-time test with a checksum (or even just a file-size) comparison would catch this issue (and any other potential download issues). It is potentially an easy fix, but I didn't have a moment to check the XML tree, and the change might have impacts elsewhere in the repo, so I'm opening it as an issue for a bit of discussion.
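As a rough sketch of the file-size variant, assuming the remote size can be obtained somewhere (whether the XML listing provides it is not confirmed here), the test could look like the following. `needs_download` and its `remote_size` parameter are hypothetical and not part of the repository.

```python
import os


def needs_download(local_path, remote_mtime, remote_size=None):
    """Hypothetical test combining a size check with the mtime comparison.

    remote_size is an assumption: it is only usable if the server or the
    listing actually reports the size of the remote file.
    """
    if not os.path.exists(local_path):
        return True
    # A size mismatch flags a truncated file regardless of its mtime.
    if remote_size is not None and os.path.getsize(local_path) != remote_size:
        return True
    # Otherwise fall back to the modification-time comparison.
    return os.path.getmtime(local_path) < remote_mtime
```

A size check only catches truncated files; a file with the right length but corrupted bytes would still need a checksum to be detected.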

Matt

@tsutterley
Owner

These are good points. I added a --checksum option to the sync program in PR #34. It doesn't compare against a hash in the XML file (I couldn't find one in the file I searched, but I might have missed it). Instead, it compares the hash of the file that exists in the file system with the hash of the one it downloads. The drawback is that this method is quite slow in comparison, since it has to download every file.
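The general idea of comparing the local file's hash with the hash of the downloaded content can be sketched as below. This is not the code from the PR: `file_sha256` and `sync_with_checksum` are hypothetical names, SHA-256 is an arbitrary choice of hash, and `remote_bytes` stands in for content that has already been fetched.

```python
import hashlib
import os


def file_sha256(path, chunk_size=65536):
    """Hash a local file in chunks so large granules are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def sync_with_checksum(local_path, remote_bytes):
    """Write remote_bytes to local_path only if the hashes differ.

    Returns True if the local file was (re)written, False if it already matched.
    """
    remote_hash = hashlib.sha256(remote_bytes).hexdigest()
    if os.path.exists(local_path) and file_sha256(local_path) == remote_hash:
        return False
    with open(local_path, "wb") as f:
        f.write(remote_bytes)
    return True
```

Because `remote_bytes` must be in hand before the comparison, every file is downloaded on every run, which is the cost noted above.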
