Add flag to skip sync/copy 0-byte files (e.g. --min-size) #2593

Open
mweibel opened this issue Feb 27, 2024 · 2 comments

mweibel commented Feb 27, 2024

Which version of the AzCopy was used?

10.22.1

Which platform are you using? (ex: Windows, Mac, Linux)

Linux

What command did you run?

azcopy sync / azcopy copy

What problem was encountered?

The application that produces the files first writes a 0-byte file and only later writes the actual file. The application is written by a third party and is not under our control.

How can we reproduce the problem in the simplest way?

Create a 0-byte file and sync/copy it using azcopy.
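
A minimal sketch of the reproduction step in Python. The helper name `make_empty_file` and the file name `placeholder.bin` are made up for illustration, and the azcopy invocations appear only as comments with placeholder arguments.

```python
import tempfile
from pathlib import Path


def make_empty_file(directory: Path, name: str = "placeholder.bin") -> Path:
    """Create a 0-byte file, mimicking the placeholder the third-party app writes."""
    path = directory / name
    path.touch()  # creates the file without writing any data
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        empty = make_empty_file(Path(tmp))
        print(empty, empty.stat().st_size)
        # Then, from a shell (source and destination are placeholders):
        #   azcopy copy <path-to-file> <destination-url>
        #   azcopy sync <source-dir> <destination-url>
```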

Have you found a mitigation/solution?

It would be great if azcopy supported a way to skip 0-byte files, e.g. via a new flag such as --min-size. This should work for both copy and sync.

siminsavani-msft (Member) commented Feb 27, 2024

Hi @mweibel! Thank you for sending a feature request; we will keep you posted if we choose to pick this up. In the meantime, I would like to suggest a few workarounds that may help in your scenario. Since you want to exclude specific files, you may be able to use the --exclude-* flags described in the copy documentation and the sync documentation.

In particular, I believe these flags will be the most useful:

  • --exclude-path (string): Exclude these paths when copying. This option doesn't support wildcard characters (*). Checks the relative path prefix (for example: myFolder;myFolder/subDirName/file.pdf). When used in combination with account traversal, paths don't include the container name.
  • --exclude-pattern (string): Exclude these files when copying. This option supports wildcard characters (*).
  • --exclude-regex (string): Exclude all relative paths of files that match the given regular expressions. Separate regular expressions with ';'.

Let me know if you have any other questions!

mweibel (Author) commented Feb 28, 2024

Hi @siminsavani-msft, thank you for your reply.

To clarify our use case a bit more:

  • the files are generated by third-party software that we don't control
  • we don't know beforehand which files will be generated (i.e. we don't know the names etc.)
  • the 0-byte files are removed after a while, and at some point, when the task is done, the real file is written

Therefore, the exclude path/pattern/regex flags don't work in this case.
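
Until a size-based flag exists, one possible stopgap is to discover the 0-byte files immediately before each run and pass their relative paths to --exclude-path, which takes a ';'-separated list as quoted above. The following is a rough sketch under stated assumptions, not an AzCopy feature: the function names are hypothetical, the destination is a placeholder, and the command is only assembled, never executed. Note the caveats: --exclude-path matches by relative path prefix, so a longer path sharing the same prefix would be excluded too; file names containing ';' would break the list; a file can change size between the scan and the transfer; and very long lists may hit command-line length limits.

```python
import os
from pathlib import Path


def find_empty_files(root: Path) -> list[str]:
    """Return sorted relative POSIX-style paths of all 0-byte regular files under root."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            try:
                if full.is_file() and full.stat().st_size == 0:
                    empty.append(full.relative_to(root).as_posix())
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return sorted(empty)


def build_exclude_path_value(root: Path) -> str:
    """Join the empty files into the ';'-separated form used by --exclude-path."""
    return ";".join(find_empty_files(root))


def build_sync_args(source: Path, destination: str) -> list[str]:
    """Assemble (but do not run) an azcopy sync command line.

    `destination` is a placeholder supplied by the caller.
    """
    args = ["azcopy", "sync", str(source), destination]
    value = build_exclude_path_value(source)
    if value:
        args += ["--exclude-path", value]
    return args
```

The resulting list could be handed to `subprocess.run` by the caller. The scan has to be repeated before every run, because the set of 0-byte files changes over time.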
