
added a count command #290

Open: gardenia wants to merge 1 commit into master


@gardenia gardenia commented Apr 9, 2022

This adds basic support for the count command:

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html#count

Currently no command-line options are supported (other than -h), but I find that it already adds value for my own use cases as is.

@colinmarc
Owner

Hi @gardenia, thanks for the contribution. This command feels pretty un-unixy, which makes me think it may not be a good fit for this binary, but I'm open to discussing it. Are there use cases not covered by ls [-R] | wc?

@gardenia
Author

gardenia commented Sep 2, 2022

Hi @colinmarc,

The main benefits as I see them are:

  • The count command conveniently reports, for a particular path in a single command invocation:
    Number of dirs
    Number of files
    Number of bytes

  • It uses a single efficient RPC provided by the HDFS namenode (getContentSummary). On a large cluster (with a large number of files/dirs), ls -R may need many round trips and pull back big listings just to compute the same information, whereas this count implementation needs just one RPC.

  • The "count" command is standard within HDFS, so users of the original Java Hadoop command line may be familiar with it and have come to rely on it (as I had).

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html#count
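
To make the round-trip argument concrete, here is a rough, self-contained sketch in Go. It does not use the real client or talk to a real namenode: fakeNamenode, item, summary, sampleNamenode and countByListing are all hypothetical names invented for illustration. It only models the idea that a recursive listing costs one request per directory, while a server-side summary costs one request in total. Counting the starting directory itself in the directory total is also an assumption of this sketch.

```go
package main

import "fmt"

// item is a hypothetical stand-in for one directory entry.
type item struct {
	name  string
	isDir bool
	size  int64 // bytes; unused for directories
}

// summary holds the three figures the count command reports.
type summary struct {
	dirs, files, bytes int64
}

// fakeNamenode is an in-memory model that counts simulated round trips.
type fakeNamenode struct {
	dirs map[string][]item // directory path -> direct children
	rpcs int
}

// list simulates one listing request for a single directory.
func (n *fakeNamenode) list(dir string) []item {
	n.rpcs++
	return n.dirs[dir]
}

// contentSummary simulates a single request; the tree walk happens "server side".
func (n *fakeNamenode) contentSummary(dir string) summary {
	n.rpcs++
	return n.walk(dir)
}

// walk totals a subtree without issuing any simulated requests.
func (n *fakeNamenode) walk(dir string) summary {
	s := summary{dirs: 1} // assumption: the starting directory counts itself
	for _, it := range n.dirs[dir] {
		if it.isDir {
			sub := n.walk(dir + "/" + it.name)
			s.dirs += sub.dirs
			s.files += sub.files
			s.bytes += sub.bytes
		} else {
			s.files++
			s.bytes += it.size
		}
	}
	return s
}

// countByListing is the client-side equivalent of ls -R: one list call per directory.
func countByListing(n *fakeNamenode, dir string) summary {
	s := summary{dirs: 1}
	for _, it := range n.list(dir) {
		if it.isDir {
			sub := countByListing(n, dir+"/"+it.name)
			s.dirs += sub.dirs
			s.files += sub.files
			s.bytes += sub.bytes
		} else {
			s.files++
			s.bytes += it.size
		}
	}
	return s
}

// sampleNamenode builds a tiny made-up tree under /data.
func sampleNamenode() *fakeNamenode {
	return &fakeNamenode{dirs: map[string][]item{
		"/data":   {{"a", true, 0}, {"b", true, 0}, {"x.txt", false, 10}},
		"/data/a": {{"y.txt", false, 20}, {"z.txt", false, 30}},
		"/data/b": {},
	}}
}

func main() {
	n := sampleNamenode()
	byListing := countByListing(n, "/data")
	listingRPCs := n.rpcs

	n.rpcs = 0
	bySummary := n.contentSummary("/data")

	fmt.Printf("listing: dirs=%d files=%d bytes=%d round trips=%d\n",
		byListing.dirs, byListing.files, byListing.bytes, listingRPCs)
	fmt.Printf("summary: dirs=%d files=%d bytes=%d round trips=%d\n",
		bySummary.dirs, bySummary.files, bySummary.bytes, n.rpcs)
}
```

With this made-up tree both approaches report 3 directories, 3 files and 60 bytes, but the listing walk needs 3 simulated round trips (one per directory) while the summary needs 1. On a real cluster the gap would grow with the number of directories under the path.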
