Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI arguments are inconsistent #243

Open
vlaci opened this issue Feb 8, 2022 · 6 comments
Open

CLI arguments are inconsistent #243

vlaci opened this issue Feb 8, 2022 · 6 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Milestone

Comments

@vlaci
Copy link
Contributor

vlaci commented Feb 8, 2022

Now almost all command line arguments have a short and a long name. Most short names are undecipherable or misleading without reading through the help. Also, few argument names leak implementation details. We shouldn't expose short switches for more obscure functionality at all and the ones we are exposing should be as intuitive as possible.

I think we should double think about the names of these switches before 1.0:

  • -n: It means dry-run in most command line programs and it may make sense to have a dry-run switch in the future, but right now it is configuring entropy calculation.
  • -e: Output location can be configured either via positional argument or -o many times. As we are supporting arbitrary input files, we shouldn't use a positional argument though.
  • --process-num/-p: There is no generally established name here. Some times -j is used to denote "jobs", or -p to indicate parallelism. I think we shouldn't indicate on our public API that we are using processes for parallelism. (It is also true for the actual process_file function too)
  • --show-external-dependencies this is an odd one we agreed about in the past that it doesn't really seem right and may make sense to use a subcommand instead but we didn't want to introduce them just for this functionality. If we are to use subcommands elsewhere, we should reconsider this as well

Suggested help output (for existing functionality)

$ unblob --help
Usage: unblob [OPTIONS] FILES...

  A tool for getting information out of any kind of binary blob.

  See '--show-external-dependencies' for details on required 3rd party tools.

Common Options:
  --help                          Show this message and exit.
  -v, --verbose                   Verbosity level, counting, maximum level: 3
                                  (use: -v, -vv, -vvv)
  -V, --version

Extraction Options:
  -d, --depth INTEGER RANGE       Recursion depth. How deep should we extract
                                  containers.  [default: 10; x>=1]
  -j, --jobs INTEGER RANGE
                                  Number of jobs to process files
                                  parallelly.  [default: 8; x>=1]
  -o, --output DIRECTORY          Extract the files to this directory. Will be
                                  created if doesn't exist.
  -P, --plugins-path PATH         Load plugins from the provided path.


Special Options:
  --entropy-depth INTEGER RANGE
                                  Entropy calculation depth. How deep should
                                  we calculate entropy for unknown files? 1
                                  means input files only, 0 turns it off.
                                  [default: 1; x>=0]
  --show-external-dependencies    Shows commands need to be available for
                                  unblob to work properly

Suggested help format containing potential future options

$ unblob --help
Usage: unblob [OPTIONS] FILES...

  A tool for getting information out of any kind of binary blob.

  See '--show-external-dependencies' for details on required 3rd party tools.

Common Options:
  --help                          Show this message and exit.
  -v, --verbose                   Verbosity level, counting, maximum level: 3
                                  (use: -v, -vv, -vvv)
  -V, --version

Extraction Options:
  -f, --force                     Overwrite already existing output
  -d, --depth INTEGER RANGE       Recursion depth. How deep should we extract
                                  containers.  [default: 10; x>=1]
  -j, --jobs INTEGER RANGE
                                  Number of jobs to process files
                                  parallelly.  [default: 8; x>=1]
  -o, --output DIRECTORY          Extract the files to this directory. Will be
                                  created if doesn't exist.
  -P, --plugins-path PATH         Load plugins from the provided path.

Special Options:
  --extract-config [KEY=VALUE...] Control finer details of extraction process.
                                  Plugins can register extra options.
                                  See '--extract-config help' for details.
  --show-external-dependencies    Shows commands need to be available for
                                  unblob to work properly

$ unblob --extract-config help
Usage: unblob --extract-config [KEY=VALUE...]

The following configuration options are available:

  cleanup_output                  Remove intermediate files [...]
  entropy_depth                   Entropy calculation depth. How deep should
                                  we calculate entropy for unknown files? 1
                                  means input files only, 0 turns it off.
                                  [default: 1; x>=0]
  ignore_magic                    Do not extract files matching to the given magic

Potential future direction with subcommands

Currently we have only the default implicit extract and the somewhat clunky show-external-dependencies functionality which doesn't warrant the addition of subcommands. If we want to add in the future e.g. Forcing a given extractor to a file, it could make sense to add subcommands (I know, I know, unpack and extract are awful verbs to use together...)

$ unblob --help
Usage: unblob [SUBCOMMAND]

Options:
  --help                          Show this message and exit.
  -v, --verbose                   Verbosity level, counting, maximum level: 3
                                  (use: -v, -vv, -vvv)
  -V, --version

Subcommands:
  extract                         Extracts a binary blob. Default command if unspecified.
  unpack                          Unpacks a file using the specified extractor
  help                            Show this help, or the help of the given subcommand
@vlaci vlaci added this to the v1.0 - extraction milestone Feb 8, 2022
@qkaiser qkaiser added documentation Improvements or additions to documentation enhancement New feature or request labels Feb 8, 2022
@qkaiser
Copy link
Contributor

qkaiser commented Feb 8, 2022

Can you show us your ideal --help output ? :)

I think the -e probably comes from our usage of binwalk in the past, I'd gladly change it to -o / --output-dir which would fall in line with the CLI of most of our extractors (7zip, jefferson, ...). Note: if we do so, we also have to change that option in unromfs.

We can decide to reserve -n for future use (dry-run switch) and move the entropy calculation switch to something more meaningful. The symbol for entropy is S, so maybe use -s / --entropy-depth ?

I don't have opinions on --show-external-dependencies at the moment. I'm not against it.

Regarding -p:

I think we shouldn't indicate on our public API that we are using processes for parallelism. (It is also true for the actual process_file function too)

You would keep the option and hide it from the --help section ? Or just hide it from the general documentation (README, wiki) ? In terms of options both -j and -p work for me.

@vlaci
Copy link
Contributor Author

vlaci commented Feb 8, 2022

Can you show us your ideal --help output ? :)

I think the -e probably comes from our usage of binwalk in the past, I'd gladly change it to -o / --output-dir which would fall in line with the CLI of most of our extractors (7zip, jefferson, ...). Note: if we do so, we also have to change that option in unromfs.

Interestingly, binwalk's use of -e is different from ours, they have -C˙for the output directory.

We can decide to reserve -n for future use (dry-run switch) and move the entropy calculation switch to something more meaningful. The symbol for entropy is S, so maybe use -s / --entropy-depth ?

I don't think that we need a short name for that

I don't have opinions on --show-external-dependencies at the moment. I'm not against it.

Regarding -p:

I think we shouldn't indicate on our public API that we are using processes for parallelism. (It is also true for the actual process_file function too)

You would keep the option and hide it from the --help section ? Or just hide it from the general documentation (README, wiki) ? In terms of options both -j and -p work for me.

I'd probably rename it to --parallel(ism) to convey the meaning without making the exact mode of parallelism out of the API. E.g. if we were decide that threads are enough, I don't want to rename this switch.

@vlaci
Copy link
Contributor Author

vlaci commented Feb 8, 2022

If there is a general consensus that we should rework the switches I am willing to design it :) At this stage it is more of a question to see if you are agreeing with my assessment.

@martonilles
Copy link
Contributor

Maybe we should also add a --version argument

@vlaci
Copy link
Contributor Author

vlaci commented Mar 8, 2022

Updated OP with suggested changes in help, also containing a potential future direction to add extensible extract configuration options without cluttering the core UX

@qkaiser
Copy link
Contributor

qkaiser commented Aug 31, 2023

Maybe we should also add a --version argument

This flag is available with the latest version (23.8.11).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants