Documentation unclear for wbt_principal_component_analysis #119

Closed
S-AQ opened this issue Nov 15, 2023 · 3 comments

S-AQ commented Nov 15, 2023

Recommendation: improve the inputs argument description to something like 'concatenated character string of file names separated by semicolons (;)'. The current description is:

#' @param inputs Input raster files.

Hi,

Thanks for the wonderful toolbox and the great package.

I wanted to run a principal component analysis but had great trouble figuring out how to define the proper data type for the inputs argument. The documentation simply said "Input raster files.", so I tried my luck at multiple options:

  • A raster stack (in the form of my_pca_raster_bands.tif)
  • A list of files (obtained by list.files(path='my/files/path', full.names =T ))
  • A vector of files (obtained by as.vector( list.files(path='my/files/path', full.names =T ) ))

The first option was met with the error: thread 'main' panicked at 'There is something incorrect about the input files. At least three inputs are required to operate this tool.', whitebox-tools-app\src\main.rs:72:21

So then I decided to supply separate inputs with the latter two options, which were met with the following error: Error in wbt_file_path(inputs) : length(x) == 1 is not TRUE

It was only when I looked at the command line interface that I saw what the input format had to be: -i='image1.tif;image2.tif;image3.tif'.

So I did:

paste(list.files(path = 'my/files/path', full.names = TRUE), collapse = ';')

I would recommend updating the description of the inputs parameter and perhaps also allowing a list of file names and/or a raster stack as input types.
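For reference, a minimal sketch of that workaround wrapped in a small helper. The name collapse_inputs and its min_files argument are assumptions made for illustration and are not part of the whitebox package; the default of three simply mirrors the "At least three inputs are required" message quoted above.

```r
# Hypothetical helper (not part of the whitebox package): collapse a vector or
# list of file paths into one semicolon-delimited string, the format shown in
# the command line example (-i='image1.tif;image2.tif;image3.tif').
collapse_inputs <- function(files, min_files = 3) {
  # Accept either a list or a character vector of paths
  files <- unlist(files, use.names = FALSE)
  # Fail early in R rather than inside the tool
  stopifnot(is.character(files), length(files) >= min_files)
  paste(files, collapse = ";")
}

inputs <- collapse_inputs(c("image1.tif", "image2.tif", "image3.tif"))
print(inputs)
#> [1] "image1.tif;image2.tif;image3.tif"
```

The resulting string has length 1, so it should pass the length(x) == 1 check that rejected the list and vector inputs above.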

Member

brownag commented Nov 15, 2023

Hello @S-AQ. This is a great suggestion.

I could probably update the internal wbt_file_path() function so that it is exported and its behavior fully documented. I could also extend it to work when multiple paths are specified as a list or vector. However, a multi-band raster is out of scope; in general, WhiteboxTools will not support these inputs except in limited cases such as RGBA images.

I will not be making changes to specific help file contents for the R package at this time. There are many tools that would benefit from more specific instructions and R-based example code.
The R package interface does not customize the parameter info, aside from a small number of relatively simple functions that have examples added to their help files. I will leave this issue open until I can consider a few other options.

I wind up fielding a lot of questions that are more about how to use WhiteboxTools than about how to use the R package, but I think, as an R user myself, that it is reasonable to expect more complete documentation. At this time the help file information in question is derived from the whitebox_tools.py file found here:

https://github.com/jblindsay/whitebox-tools/blob/5a82f513e77cf1c74778995b5d6304dd9d9f372f/whitebox_tools.py#L9601-L9609

I would like to be able to include more information from the manual e.g. https://www.whiteboxgeo.com/manual/wbt_book/available_tools/mathand_stats_tools.html?highlight=principal%20components#principalcomponentanalysis but even the full manual page does not specify in the parameter description how to have multiple input files. As you point out, you must look at the example command to figure that out.

The WhiteboxTools supported data formats manual section goes into some more detail on what you should expect to be able to do. https://www.whiteboxgeo.com/manual/wbt_book/supported_formats.html

If you would like to use a SpatRaster/RasterLayer as input, consider the wbt() function which is somewhat more flexible, and supports R spatial objects backed by files as inputs, but still would not support multiband input in this case.

Member

brownag commented Nov 18, 2023

I have added documentation on how to concatenate multiple file paths, with a reference to the wbt_file_path() method. In addition, the wbt_file_path() method has been extended in #121 to work with inputs of length > 1 and with terra objects.

The following now works:

library(whitebox)
library(terra)
#> terra 1.7.60

wbt_verbose(TRUE)

# works
r <- rast(system.file("ex", "elev.tif", package = "terra"))
wbt_slope(r, "./slope.tif")
#> slope - Elapsed Time (excluding I/O): 0.2s

# works
v <- vect(system.file("ex", "lux.shp", package = "terra"))
wbt_clean_vector(v, "./test.shp")
#> clean_vector - Elapsed Time: 0.0s

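# works: each band of a multiband raster is written to its own file first
logo <- rast(system.file("ex", "logo.tif", package = "terra"))
r3 <- rast(lapply(names(logo), \(n) writeRaster(logo[[n]], paste0(n, ".tif"), overwrite = TRUE)))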
wbt_principal_component_analysis(r3, "./pca.html")
#> principal_component_analysis - Elapsed Time (including I/O): 0.322s
# browseURL("pca.html")

The following cases (multiband input, and raster in memory) do not work at this time. Making them work would require writing to temporary files that are then passed to WhiteboxTools. The wbt() method converts a vector data source to a Shapefile and an in-memory raster to a GeoTIFF, but it does not (yet) do so for multiband inputs, which need to be separated into individual files. At some point, in theory, this behavior could be extended to all wbt_* function inputs so that we have fake multiband support etc. However, I think the behavior would need to be gated behind an option outside of the wbt() method so that users do not incur a performance/storage penalty.

# does not work "At least three inputs are required to operate this tool."
logo <- rast(system.file("ex", "logo.tif", package = "terra"))
wbt_principal_component_analysis(logo, "./pca.html")
#> 
#> Error running WhiteboxTools (PrincipalComponentAnalysis)
#>   whitebox.exe_path: '/home/andrew/.local/share/R/whitebox/WBT/whitebox_tools'; File exists? TRUE
#>   Arguments: --run=PrincipalComponentAnalysis  --inputs='/home/andrew/R/x86_64-pc-linux-gnu-library/4.3/terra/ex/logo.tif' --output='./pca.html' -v
#> System command had status 101
#> *****************************************
#> * Welcome to PrincipalComponentAnalysis *
#> * Powered by WhiteboxTools              *
#> * www.whiteboxgeo.com                   *
#> *****************************************
#> principal_component_analysis - Elapsed Time: NA [did not run]

# informative error (raster in memory)
r2 <- r*2
wbt_slope(r2, "./slope2.tif")
#> Error: The supplied 'terra' object for `dem` is not backed by a file. Try loading the object directly from the source file with `terra::rast()` or `terra::vect()`. Various raster formats and ESRI Shapefile are supported. See <https://www.whiteboxgeo.com/manual/wbt_book/supported_formats.html> for details.
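The temporary-file approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: split_layers_to_files is a hypothetical name, not a whitebox function, and the sketch relies only on terra::nlyr() and terra::writeRaster(). It writes every layer to its own GeoTIFF in a temporary directory, so it carries the storage cost mentioned above.

```r
# Hypothetical helper (not part of the whitebox package): write each layer of a
# SpatRaster to its own GeoTIFF and return the paths joined with ";".
split_layers_to_files <- function(x, dir = tempdir()) {
  stopifnot(requireNamespace("terra", quietly = TRUE))
  # One file per layer, named after the layer (made syntactically safe and unique)
  paths <- file.path(dir, paste0(make.names(names(x), unique = TRUE), ".tif"))
  for (i in seq_len(terra::nlyr(x))) {
    terra::writeRaster(x[[i]], paths[i], overwrite = TRUE)
  }
  paste(paths, collapse = ";")
}

# Only run the example when terra is installed
if (requireNamespace("terra", quietly = TRUE)) {
  logo <- terra::rast(system.file("ex", "logo.tif", package = "terra"))
  inputs <- split_layers_to_files(logo)
  # The string could then be passed on, e.g.:
  # wbt_principal_component_analysis(inputs, "./pca.html")
}
```

Because the layer values are written out explicitly, the same sketch should also cover a raster held in memory, at the price of extra disk writes.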

Member

brownag commented May 27, 2024

I am going to close this issue now. The wbt_file_path() documentation hopefully clarifies how paths can be specified, and how R objects can be used as surrogates to pass paths to WhiteboxTools.

If, in the future, the metadata for functions and their arguments (available via the wbttools and wbttoolparameters data sets in the R package) is updated, then we could add that information to the R manual. However, I am not planning on doing any custom parsing of the HTML manual contents at this time.

brownag closed this as completed May 27, 2024