SITS: Satellite Image Time Series Analysis for Earth Observation Data Cubes (submit R package for review) #596

Open
14 of 15 tasks
gilbertocamara opened this issue Jul 2, 2023 · 47 comments
@gilbertocamara

gilbertocamara commented Jul 2, 2023

Submitting Author Name: Gilberto Camara
Submitting Author Github Handle: @gilbertocamara
Other Package Authors Github handles: @rolfsimoes, @OldLipe, @pedro-andrade-inpe
Repository: https://github.com/e-sensing/sits
Version submitted: 1.4.2
Submission type: Standard
Editor: TBD
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

  • Paste the full DESCRIPTION file inside a code block below:
Package: sits
Type: Package
Version: 1.4.2
Title: Satellite Image Time Series Analysis for Earth Observation Data Cubes
Authors@R: c(person('Rolf', 'Simoes', role = c('aut'), email = 'rolf.simoes@inpe.br'),
             person('Gilberto', 'Camara', role = c('aut', 'cre'), email = 'gilberto.camara.inpe@gmail.com'),
             person('Felipe', 'Souza', role = c('aut'), email = 'felipe.carvalho@inpe.br'),
             person('Lorena', 'Santos', role = c('aut'), email = 'lorena.santos@inpe.br'),
             person('Pedro', 'Andrade', role = c('aut'), email = 'pedro.andrade@inpe.br'),
             person('Karine', 'Ferreira', role = c('aut'), email = 'karine.ferreira@inpe.br'),
             person('Alber', 'Sanchez', role = c('aut'), email = 'alber.ipia@inpe.br'),
             person('Gilberto', 'Queiroz', role = c('aut'), email = 'gilberto.queiroz@inpe.br')
             )
Maintainer: Gilberto Camara <gilberto.camara.inpe@gmail.com>
Description: An end-to-end toolkit for land use and land cover classification
    using big Earth observation data, based on machine learning methods 
    applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>.
    Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, 
    Brazil Data Cube, and Digital Earth Africa using the Spatio-temporal Asset Catalog (STAC) 
    protocol (<https://stacspec.org/>) and the 'gdalcubes' R package 
    developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>.
    Supports visualization methods for images and time series and 
    smoothing filters for dealing with noisy time series.
    Includes functions for quality assessment of training samples using self-organized maps 
    as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. 
    Provides machine learning methods including support vector machines, 
    random forests, extreme gradient boosting, multi-layer perceptrons,
    temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, 
    residual networks by Fawaz et al (2019) <doi:10.1007/s10618-019-00619-1>, and temporal attention encoders
    by Garnot and Landrieu (2020) <arXiv:2007.00586>.
    Performs efficient classification of big Earth observation data cubes and includes 
    functions for post-classification smoothing based on Bayesian inference, and 
    methods for uncertainty assessment. Enables best
    practices for estimating area and assessing accuracy of land change as 
    recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>.
    Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.
Encoding: UTF-8
Language: en-US
Depends: R (>= 4.1.0)
URL: https://github.com/e-sensing/sits/, https://e-sensing.github.io/sitsbook/
BugReports: https://github.com/e-sensing/sits/issues
License: GPL-2
ByteCompile: true
LazyData: true
Imports:
    yaml,
    dplyr (>= 1.0.0),
    gdalUtilities,
    grDevices,
    graphics,
    lubridate,
    parallel (>= 4.0.5),
    purrr (>= 0.3.0),
    Rcpp,
    rstac (>= 0.9.2-3),
    sf (>= 1.0-12),
    slider (>= 0.2.0),
    stats,
    terra (>= 1.5-17),
    tibble (>= 3.1),
    tidyr (>= 1.2.0),
    torch (>= 0.9.0),
    utils
Suggests:
    caret,
    dendextend,
    dtwclust,
    DiagrammeR,
    digest,
    e1071,
    exactextractr,
    FNN,
    future,
    gdalcubes (>= 0.6.0),
    geojsonsf,
    ggplot2,
    httr,
    jsonlite,
    kohonen (>= 3.0.11),
    leafem (>= 0.2.0),
    leaflet (>= 2.1.1),
    luz (>= 0.3.0),
    methods,
    mgcv,
    nnet,
    openxlsx,
    randomForest,
    randomForestExplainer,
    RcppArmadillo (>= 0.11),
    scales,
    stars (>= 0.6),
    supercells,
    testthat (>= 3.1.3),
    tmap (>= 3.3),
    torchopt (>= 0.1.2),
    xgboost,
    covr
Config/testthat/edition: 3
Config/testthat/parallel: false
Config/testthat/start-first: cube, raster, regularize, data, ml
LinkingTo:
    Rcpp,
    RcppArmadillo
RoxygenNote: 7.2.3
Collate: 
    'api_accessors.R'
    'api_accuracy.R'
    'api_apply.R'
    'api_band.R'
    'api_bbox.R'
    'api_block.R'
    'api_check.R'
    'api_chunks.R'
    'api_classify.R'
    'api_cluster.R'
    'api_colors.R'
    'api_combine_predictions.R'
    'api_comp.R'
    'api_conf.R'
    'api_csv.R'
    'api_cube.R'
    'api_data.R'
    'api_debug.R'
    'api_download.R'
    'api_factory.R'
    'api_file_info.R'
    'api_file.R'
    'api_gdal.R'
    'api_gdalcubes.R'
    'api_imputation.R'
    'api_jobs.R'
    'api_label_class.R'
    'api_mixture_model.R'
    'api_ml_model.R'
    'api_mosaic.R'
    'api_parallel.R'
    'api_period.R'
    'api_plot_time_series.R'
    'api_plot_raster.R'
    'api_point.R'
    'api_predictors.R'
    'api_raster.R'
    'api_raster_sub_image.R'
    'api_raster_terra.R'
    'api_reclassify.R'
    'api_roi.R'
    'api_samples.R'
    'api_segments.R'
    'api_sf.R'
    'api_shp.R'
    'api_signal.R'
    'api_smooth.R'
    'api_smote.R'
    'api_som.R'
    'api_source.R'
    'api_source_aws.R'
    'api_source_bdc.R'
    'api_source_deafrica.R'
    'api_source_hls.R'
    'api_source_local.R'
    'api_source_mpc.R'
    'api_source_sdc.R'
    'api_source_stac.R'
    'api_source_usgs.R'
    'api_space_time_operations.R'
    'api_stac.R'
    'api_stats.R'
    'api_summary.R'
    'api_tibble.R'
    'api_tile.R'
    'api_timeline.R'
    'api_torch.R'
    'api_torch_psetae.R'
    'api_ts.R'
    'api_tuning.R'
    'api_uncertainty.R'
    'api_utils.R'
    'api_variance.R'
    'api_view.R'
    'RcppExports.R'
    'data.R'
    'sits-package.R'
    'sits_apply.R'
    'sits_accuracy.R'
    'sits_active_learning.R'
    'sits_bands.R'
    'sits_bbox.R'
    'sits_classify.R'
    'sits_colors.R'
    'sits_combine_predictions.R'
    'sits_config.R'
    'sits_csv.R'
    'sits_cube.R'
    'sits_cube_copy.R'
    'sits_cluster.R'
    'sits_factory.R'
    'sits_filters.R'
    'sits_geo_dist.R'
    'sits_get_data.R'
    'sits_labels.R'
    'sits_label_classification.R'
    'sits_lighttae.R'
    'sits_machine_learning.R'
    'sits_merge.R'
    'sits_mixture_model.R'
    'sits_mlp.R'
    'sits_mosaic.R'
    'sits_model_export.R'
    'sits_patterns.R'
    'sits_plot.R'
    'sits_predictors.R'
    'sits_reclassify.R'
    'sits_regularize.R'
    'sits_resnet.R'
    'sits_sample_functions.R'
    'sits_segmentation.R'
    'sits_select.R'
    'sits_sf.R'
    'sits_smooth.R'
    'sits_som.R'
    'sits_summary.R'
    'sits_tae.R'
    'sits_tempcnn.R'
    'sits_timeline.R'
    'sits_train.R'
    'sits_tuning.R'
    'sits_utils.R'
    'sits_uncertainty.R'
    'sits_validate.R'
    'sits_view.R'
    'sits_values.R'
    'sits_variance.R'
    'sits_xlsx.R'
    'zzz.R'

Scope

  • Please indicate which category or categories from our package fit policies this package falls under:

    • [X ] geospatial data
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):
    sits is a package for satellite image time series analysis that works with big Earth observation data sets.

  • Who is the target audience and what are scientific applications of this package?
    The target audience consists of remote sensing and environmental experts who want to classify remote sensing images for applications such as deforestation detection, agricultural and land use/land cover mapping, biodiversity conservation, and land degradation monitoring.

  • Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
    There are currently no other open source software packages that have the same capabilities.

  • (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
    Not applicable

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
    Not applicable

  • Explain reasons for any pkgcheck items which your package is unable to pass.

(a) Vignettes: instead of preparing vignettes, the authors have written an on-line book that describes the contents of the package in detail. The book is available at the URL https://e-sensing.github.io/sitsbook/

Important notes:

(1) To run the tests, examples, and code coverage, please make
sure the following environment variables are set in the R session:
Sys.setenv("SITS_RUN_TESTS" = "YES")
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")
sits is a fairly large package, and the tests take a long time to run, since they access cloud services. For this reason, testing needs to be manually enabled.
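
The opt-in mechanism described in note (1) can be sketched in base R as follows. This is only an illustration of the idea: `tests_enabled()` is a hypothetical helper, not a function from sits, and the package's own gating code may differ.

```r
# Hypothetical helper (not part of sits): TRUE only when the opt-in
# variable is set to exactly "YES".
tests_enabled <- function(var = "SITS_RUN_TESTS") {
  identical(Sys.getenv(var), "YES")
}

# Without the variable, long-running tests would be skipped.
Sys.unsetenv("SITS_RUN_TESTS")
before <- tests_enabled()

# After opting in, as requested in the note above, they would run.
Sys.setenv("SITS_RUN_TESTS" = "YES")
after <- tests_enabled()

print(c(before = before, after = after))
```

A test file could call such a helper at the top and skip its body when it returns `FALSE`, which keeps cloud-dependent tests out of default check runs.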

(2) Please review version 1.4.2, not yet on CRAN, which is available in the "dev" branch in the github repository.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?
    The package is already on CRAN.

  • Do you intend for this package to go on Bioconductor?

  • [ x] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@ropensci-review-bot
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Checks for sits (v1.4.1)

git hash: 6eac9edf

  • ✖️ Package name is not available (on CRAN).
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✖️ The following function has no documented return value: [sits_filter]
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✖️ Package has no HTML vignettes
  • ✖️ These functions do not have examples: [plot.sits_cluster, sits_filter, sits_list_collections].
  • ✔️ Package has continuous integration checks.
  • ✖️ Package coverage is 0.1% (should be at least 75%).
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL-2


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type package ncalls
internal base 1922
internal sits 134
internal datasets 3
internal tools 1
imports utils 259
imports stats 178
imports purrr 129
imports dplyr 97
imports tibble 67
imports torch 60
imports graphics 52
imports sf 52
imports rstac 48
imports terra 48
imports slider 47
imports tidyr 43
imports grDevices 42
imports lubridate 18
imports yaml 5
imports gdalUtilities NA
imports parallel NA
imports Rcpp NA
suggests ggplot2 67
suggests leaflet 16
suggests digest 15
suggests tmap 13
suggests httr 12
suggests stars 12
suggests luz 11
suggests mgcv 5
suggests dtwclust 4
suggests xgboost 4
suggests caret 3
suggests FNN 3
suggests scales 3
suggests gdalcubes 2
suggests kohonen 2
suggests methods 2
suggests openxlsx 2
suggests e1071 1
suggests exactextractr 1
suggests randomForest 1
suggests randomForestExplainer 1
suggests supercells 1
suggests dendextend NA
suggests DiagrammeR NA
suggests future NA
suggests geojsonsf NA
suggests jsonlite NA
suggests leafem NA
suggests nnet NA
suggests RcppArmadillo NA
suggests testthat NA
suggests torchopt NA
suggests covr NA
linking_to Rcpp NA
linking_to RcppArmadillo NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

list (172), source (167), c (162), labels (100), nrow (72), length (62), file (55), scale (41), names (39), row (39), seq_len (38), unlist (37), col (32), date (31), unique (28), lapply (25), unname (25), q (23), sum (23), toupper (23), for (22), paste (22), return (19), as.Date (18), options (18), as.numeric (17), paste0 (17), seq_along (17), ceiling (15), class (15), apply (14), dim (14), environment (13), round (12), which (12), as.character (11), as.matrix (11), Sys.time (10), try (10), factor (9), rep (9), seq (9), sqrt (9), suppressMessages (9), url (9), array (8), do.call (8), format (8), matrix (8), ncol (8), rbind (8), sample (8), substitute (8), t (8), table (8), by (7), colnames (7), mean (7), Sys.getenv (7), tryCatch (7), character (6), data.frame (6), floor (6), formals (6), gamma (6), levels (6), strsplit (6), structure (6), version (6), which.min (6), as.integer (5), if (5), missing (5), replace (5), suppressWarnings (5), which.max (5), abs (4), any (4), as.list (4), choose (4), grepl (4), inherits (4), list2env (4), mode (4), rev (4), row.names (4), setdiff (4), tempdir (4), args (3), cbind (3), colSums (3), diag (3), eval (3), get (3), is.na (3), path.expand (3), plot (3), prop.table (3), raw (3), readRDS (3), signif (3), socketSelect (3), split (3), switch (3), system.file (3), vapply (3), as.factor (2), debug (2), deparse (2), double (2), exp (2), gsub (2), I (2), is.logical (2), log (2), logical (2), max.col (2), parent.env (2), pmin (2), quote (2), range (2), rawConnection (2), rowSums (2), sort (2), subset (2), substring (2), summary (2), tolower (2), write (2), all (1), all.vars (1), as.vector (1), basename (1), cat (1), cut (1), difftime (1), dir (1), dir.exists (1), dirname (1), drop (1), exists (1), file.exists (1), gc (1), integer (1), is.null (1), kappa (1), lengths (1), list.files (1), new.env (1), numeric (1), open (1), order (1), parent.frame (1), rbind.data.frame (1), remove (1), sample.int (1), srcfile (1), svd (1), sys.frames (1), 
tapply (1), tempfile (1), vector (1), within (1)

utils

data (249), txtProgressBar (3), head (2), write.csv (2), getTxtProgressBar (1), read.csv (1), tail (1)

stats

ts (44), offset (36), formula (14), D (8), kernel (8), time (8), runif (6), predict (5), dt (4), na.action (4), quantile (4), sd (4), df (3), step (3), weights (3), as.formula (2), end (2), median (2), rbeta (2), rf (2), rlnorm (2), rnorm (2), start (2), addmargins (1), as.dendrogram (1), cutree (1), filter (1), reshape (1), smooth (1), var (1), window (1)

sits

sits_bands (25), sits_labels (21), sits_timeline (19), sits_select (10), train_fun (8), C_mask_na (3), mixture_fn (3), C_label_max_prob (2), C_max_sampling (2), comb_fn (2), download_fn (2), filter_call (2), label_fn (2), reclassify_fn (2), sits_train (2), sits_values (2), smooth_fn (2), .debug (1), C_entropy_probs (1), C_fill_na (1), C_kernel_max (1), C_kernel_mean (1), C_kernel_median (1), C_kernel_min (1), C_kernel_sd (1), C_least_probs (1), C_margin_probs (1), C_nnls_solver_batch (1), C_normalize_data (1), C_normalize_data_0 (1), create_iqr (1), impute_fun (1), plot_samples (1), replace_na (1), result_fun (1), seg_fun (1), sits_as_sf (1), sits_colors (1), sits_som_map (1), sits_validate (1), sits_view.sits (1), submit (1)

purrr

map (63), map_chr (16), map_dfr (12), map_dfc (7), map2_dfr (7), pmap_dfr (6), map_lgl (5), pmap (4), is_character (2), map_dbl (2), imap_dfc (1), map_int (1), map2 (1), pmap_lgl (1), transpose (1)

dplyr

filter (16), mutate (16), bind_rows (11), bind_cols (9), all_of (7), select (6), n (5), group_by (4), slice_sample (4), distinct (3), slice_head (3), left_join (2), select_if (2), starts_with (2), tibble (2), cur_group_id (1), inner_join (1), matches (1), summarise (1), transmute (1)

ggplot2

ggplot (12), aes (9), element_text (7), scale_fill_manual (5), element_rect (3), geom_line (3), labs (3), theme (3), .pt (2), element_blank (2), facet_grid (2), geom_point (2), geom_rect (2), guide_legend (2), position_dodge (2), scale_x_continuous (2), facet_wrap (1), geom_histogram (1), guides (1), scale_color_brewer (1), scale_x_date (1), scale_x_log10 (1)

tibble

tibble (49), as_tibble_row (10), as_tibble (7), lst (1)

torch

nn_module (20), torch_tensor (11), nn_cross_entropy_loss (5), nn_softmax (5), nn_batch_norm1d (4), torch_matmul (3), torch_zeros (3), nn_init_normal_ (2), torch_stack (2), torch_transpose (2), nn_dropout (1), torch_exp (1), torch_mean (1)

graphics

segments (15), title (12), points (10), legend (9), plot (3), abline (2), grid (1)

sf

st_crs (8), st_sample (7), st_transform (7), st_as_sf (5), st_drop_geometry (3), st_intersects (3), st_coordinates (2), st_geometry (2), st_geometry_type (2), st_intersection (2), st_write (2), read_sf (1), st_bbox (1), st_convex_hull (1), st_distance (1), st_is_empty (1), st_polygon (1), st_sf (1), st_sfc (1), st_within (1)

rstac

post_request (15), items_reap (11), items_fetch (9), items_matched (6), sign_planetary_computer (3), ext_query (2), stac (1), stac_search (1)

terra

rast (7), ext (4), spatSample (4), xFromCol (4), yFromRow (4), readValues (3), crop (2), crs (2), expanse (2), extract (2), freq (2), fileBlocksize (1), ncol (1), nlyr (1), nrow (1), readStart (1), values (1), xmax (1), xmin (1), xres (1), ymax (1), ymin (1), yres (1)

slider

slide (18), slide_dfr (16), slide_dbl (4), slide_lgl (4), slide2_dfr (3), slide_chr (1), slide2_lgl (1)

tidyr

nest (14), unnest (11), expand_grid (9), pivot_longer (6), starts_with (2), everything (1)

grDevices

colors (30), palette (9), hcl.colors (3)

lubridate

as_date (13), fast_strptime (1), mday (1), month (1), year (1), ymd (1)

leaflet

layersControlOptions (12), colorFactor (2), leaflet (1), providers$GeoportailFrance.orthos (1)

digest

digest (15)

tmap

tm_borders (5), tm_shape (3), tm_layout (2), tm_facets (1), tm_raster (1), tmap_options (1)

httr

add_headers (4), write_disk (3), parse_url (2), build_url (1), content (1), GET (1)

stars

read_stars (8), st_warp (4)

luz

luz_metric_accuracy (5), setup (5), luz_callback_early_stopping (1)

mgcv

gam (3), predict.gam (2)

yaml

yaml.load_file (4), as.yaml (1)

dtwclust

hierarchical_control (2), cvi (1), tsclust (1)

xgboost

xgb.plot.tree (3), xgboost (1)

caret

confusionMatrix (3)

datasets

trees (3)

FNN

knnx.index (3)

scales

date_format (1), label_number (1), pretty_breaks (1)

gdalcubes

gdalcubes_options (1), image_collection (1)

kohonen

somgrid (1), supersom (1)

methods

S3Part (2)

openxlsx

createWorkbook (1), saveWorkbook (1)

e1071

svm (1)

exactextractr

exact_extract (1)

randomForest

randomForest (1)

randomForestExplainer

plot_min_depth_distribution (1)

supercells

supercells (1)

tools

file_path_sans_ext (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in C++ (5% in 14 files) and R (95% in 131 files)
  • 8 authors
  • no vignette
  • 4 internal data files
  • 18 imported packages
  • 164 exported functions (median 11 lines of code)
  • 1988 non-exported functions in R (median 7 lines of code)
  • 73 R functions (median 11 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

| measure | value | percentile | noteworthy |
|---|---|---|---|
| files_R | 131 | 99.5 | |
| files_src | 14 | 95.0 | |
| files_vignettes | 0 | 0.0 | TRUE |
| files_tests | 46 | 99.0 | |
| loc_R | 18106 | 99.6 | TRUE |
| loc_src | 1021 | 62.7 | |
| loc_tests | 5220 | 98.3 | TRUE |
| num_vignettes | 0 | 0.0 | TRUE |
| data_size_total | 186830 | 86.9 | |
| data_size_median | 50517 | 88.2 | |
| n_fns_r | 2152 | 99.9 | TRUE |
| n_fns_r_exported | 164 | 97.9 | TRUE |
| n_fns_r_not_exported | 1988 | 99.9 | TRUE |
| n_fns_src | 73 | 74.2 | |
| n_fns_per_file_r | 9 | 84.0 | |
| n_fns_per_file_src | 5 | 49.3 | |
| num_params_per_fn | 4 | 54.6 | |
| loc_per_fn_r | 8 | 20.0 | |
| loc_per_fn_r_exp | 11 | 25.1 | |
| loc_per_fn_r_not_exp | 7 | 18.0 | |
| loc_per_fn_src | 11 | 28.5 | |
| rel_whitespace_R | 12 | 98.9 | TRUE |
| rel_whitespace_src | 13 | 56.8 | |
| rel_whitespace_tests | 14 | 97.2 | TRUE |
| doclines_per_fn_exp | 44 | 55.5 | |
| doclines_per_fn_not_exp | 0 | 0.0 | TRUE |
| fn_call_network_size | 3546 | 99.5 | TRUE |

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

R-CMD-check.yaml

GitHub Workflow Results

id name conclusion sha run_number date
5438924419 R-CMD-check success bc5d6c 116 2023-07-02

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following note:

  1. checking installed package size ... NOTE
    installed size is 16.7Mb
    sub-directories of 1Mb or more:
    libs 14.1Mb

R CMD check generated the following check_fail:

  1. rcmdcheck_reasonable_installed_size

Test coverage with covr

Package coverage: 0.08

The following files are not completely covered by tests:

file coverage
R/api_accessors.R 0%
R/api_accuracy.R 0%
R/api_apply.R 0%
R/api_band.R 0%
R/api_bbox.R 0%
R/api_block.R 0%
R/api_check.R 0%
R/api_chunks.R 0%
R/api_classify.R 0%
R/api_cluster.R 0%
R/api_combine_predictions.R 0%
R/api_comp.R 0%
R/api_conf.R 0%
R/api_csv.R 0%
R/api_cube.R 0%
R/api_data.R 0%
R/api_debug.R 0%
R/api_download.R 0%
R/api_expressions.R 0%
R/api_factory.R 0%
R/api_file_info.R 0%
R/api_file.R 0%
R/api_gdal.R 0%
R/api_gdalcubes.R 0%
R/api_imputation.R 0%
R/api_jobs.R 0%
R/api_label_class.R 0%
R/api_mixture_model.R 0%
R/api_ml_model.R 0%
R/api_mosaic.R 0%
R/api_parallel.R 0%
R/api_period.R 0%
R/api_plot_raster.R 0%
R/api_plot_time_series.R 0%
R/api_point.R 0%
R/api_predictors.R 0%
R/api_raster_sub_image.R 0%
R/api_raster_terra.R 0%
R/api_raster.R 0%
R/api_reclassify.R 0%
R/api_roi.R 0%
R/api_samples.R 0%
R/api_segments.R 0%
R/api_sf.R 0%
R/api_shp.R 0%
R/api_signal.R 0%
R/api_smooth.R 0%
R/api_smote.R 0%
R/api_som.R 0%
R/api_source_aws.R 0%
R/api_source_bdc.R 0%
R/api_source_deafrica.R 0%
R/api_source_hls.R 0%
R/api_source_local.R 0%
R/api_source_mpc.R 0%
R/api_source_sdc.R 0%
R/api_source_stac.R 0%
R/api_source_usgs.R 0%
R/api_source.R 0%
R/api_space_time_operations.R 0%
R/api_stac.R 0%
R/api_stats.R 0%
R/api_summary.R 0%
R/api_tibble.R 0%
R/api_tile.R 0%
R/api_timeline.R 0%
R/api_torch.R 0%
R/api_ts.R 0%
R/api_tuning.R 0%
R/api_uncertainty.R 0%
R/api_utils.R 0%
R/api_variance.R 0%
R/api_view.R 0%
R/sits_accuracy.R 0%
R/sits_active_learning.R 0%
R/sits_apply.R 0%
R/sits_bands.R 0%
R/sits_bbox.R 0%
R/sits_classify.R 0%
R/sits_cluster.R 0%
R/sits_colors.R 0%
R/sits_combine_predictions.R 0%
R/sits_config.R 0%
R/sits_csv.R 0%
R/sits_cube_copy.R 0%
R/sits_cube.R 0%
R/sits_factory.R 0%
R/sits_filters.R 0%
R/sits_geo_dist.R 0%
R/sits_get_data.R 0%
R/sits_label_classification.R 0%
R/sits_labels.R 0%
R/sits_lighttae.R 0%
R/sits_machine_learning.R 0%
R/sits_merge.R 0%
R/sits_mixture_model.R 0%
R/sits_mlp.R 0%
R/sits_model_export.R 0%
R/sits_mosaic.R 0%
R/sits_patterns.R 0%
R/sits_plot.R 0%
R/sits_predictors.R 0%
R/sits_reclassify.R 0%
R/sits_regularize.R 0%
R/sits_resnet.R 0%
R/sits_sample_functions.R 0%
R/sits_segmentation.R 0%
R/sits_select.R 0%
R/sits_sf.R 0%
R/sits_smooth.R 0%
R/sits_som.R 0%
R/sits_summary.R 0%
R/sits_tae.R 0%
R/sits_tempcnn.R 0%
R/sits_temporal_segmentation.R 0%
R/sits_timeline.R 0%
R/sits_train.R 0%
R/sits_tuning.R 0%
R/sits_uncertainty.R 0%
R/sits_utils.R 12.5%
R/sits_validate.R 0%
R/sits_values.R 0%
R/sits_variance.R 0%
R/sits_view.R 0%
R/sits_xlsx.R 0%
src/combine_data.cpp 0%
src/kernel.cpp 0%
src/label_class.cpp 0%
src/linear_interp.cpp 0%
src/nnls_solver.cpp 0%
src/normalize_data_0.cpp 0%
src/normalize_data.cpp 0%
src/sampling_window.cpp 0%
src/smooth_bayes.cpp 0%
src/smooth_sgp.cpp 0%
src/smooth_whit.cpp 0%
src/smooth.cpp 0%
src/uncertainty.cpp 0%

Cyclocomplexity with cyclocomp

The following function has cyclocomplexity >= 15:

function cyclocomplexity
sits_cube.stac_cube 16

Static code analyses with lintr

lintr found the following 23 potential issues:

message number of times
Avoid library() and require() calls in packages 16
Lines should not be more than 80 characters. 3
Use <-, not =, for assignment. 4


Package Versions

package version
pkgstats 0.1.3.4
pkgcheck 0.1.1.26


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@gilbertocamara
Author

Many thanks for your response. Please see below the following explanation, which was included as an "Important Note" in the submission, but maybe it has failed to catch the attention of the reviewers.

  • Explain reasons for any pkgcheck items which your package is unable to pass.
    ✖️ Package name is not available (on CRAN).
    Package is already on CRAN. See
https://cran.r-project.org/web/packages/sits/index.html

✖️ The following function has no documented return value: [sits_filter]
✖️ These functions do not have examples: [plot.sits_cluster, sits_filter, sits_list_collections].
These problems have been fixed in version 1.4.2 of the package, which is available in the "dev" branch on GitHub. To clone the "dev" branch please use the command

git clone https://github.com/e-sensing/sits.git --branch dev

✖️ Package has no HTML vignettes
Instead of preparing vignettes, the authors have written an on-line book that describes the contents of the package in detail. The book is available at the URL

https://e-sensing.github.io/sitsbook/

✖️ Package coverage is 0.1% (should be at least 75%).
Package coverage is actually 95%. Please see

https://app.codecov.io/github/e-sensing/sits/tree/dev/R

sits is a large package. There are more than 1,100 individual tests that take a long time to run. Some of these tests access cloud services, which might be temporarily offline. For this reason, testing needs to be manually enabled. To run the tests, examples, and code coverage, please set the following environment variables in the R session:

Sys.setenv("SITS_RUN_TESTS" = "YES")
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")

We are confident that sits meets the required criteria for rOpenSci review.

We would also like to respond to the lintr message:

Avoid library() and require() calls in packages - 16 

The package directly imports 17 packages, which are required by most functions. It also suggests 33 packages, each of which is typically used in only a few functions and therefore needs to be loaded only on an as-needed basis. This follows CRAN policies that restrict the number of imported packages.
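
One common way to use a suggested package only when it is needed is to check for it at call time rather than attaching it with `library()`. The sketch below shows that pattern; `use_suggested()` is a hypothetical helper written for illustration and is not taken from the sits code base.

```r
# Hypothetical helper: fail with a clear message if a suggested package
# is missing, without attaching it to the search path.
use_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Please install package '", pkg, "' to use this feature.",
         call. = FALSE)
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this check succeeds.
ok <- use_suggested("stats")

# A missing package produces an informative error instead of a crash
# deep inside the calling function.
missing_pkg <- tryCatch(
  use_suggested("aPackageThatDoesNotExist123"),
  error = function(e) conditionMessage(e)
)
print(missing_pkg)
```

After such a check, functions from the suggested package can be called with the `pkg::fn()` form, which is the usage lintr's message points toward.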

@maelle
Member

maelle commented Jul 7, 2023

Thank you for your submission, @gilbertocamara, as well as for your careful response to the automatic checks. We agree with all your responses above.

However, as this is a package that implements statistical and ML methods for geospatial data, rather than just the “accessing, manipulating, converting” in our scope, it falls under our newer statistical peer review program, which has its own time series and [geospatial standards](https://stats-devguide.ropensci.org/standards.html#standards-spatial). Submission requirements are different for this program, as authors need to document their standards compliance with our code annotation system.

A note: SITS is a package that is very large in scope and code base, as exemplified by the fact that it has a whole book for its documentation. As such, we anticipate that it will be challenging to find reviewers, and we will need to give them considerably longer than usual to review the code base and documentation in full. Most of our submissions are not as large or mature at the point of review and are still open to significant API or architecture changes in response to review. For something in an earlier stage we would likely have suggested breaking functionality up into smaller, more focused packages. Nonetheless, we are up for the challenge if you are up for the higher statistical submission requirements and potential changes.

One last note regarding check results: As of now we can't set any environment variables when running the checks automatically, so they'd have to be set on your side, maybe using the withr::local_envvar() function or similar.
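
A minimal sketch of what setting the variables "on your side" could look like with `withr::local_envvar()`, assuming the withr package is installed. The wrapper function `run_with_flags()` is hypothetical; in a real package the call might instead live in a testthat setup file.

```r
library(withr)

# Hypothetical wrapper: the variables are set only while this function
# runs and are restored to their previous state when it exits.
run_with_flags <- function() {
  local_envvar(c(SITS_RUN_TESTS = "YES", SITS_RUN_EXAMPLES = "YES"))
  Sys.getenv(c("SITS_RUN_TESTS", "SITS_RUN_EXAMPLES"))
}

# Start from a clean state so the effect of the scoping is visible.
Sys.unsetenv(c("SITS_RUN_TESTS", "SITS_RUN_EXAMPLES"))

inside <- run_with_flags()               # values seen inside the scope
outside <- Sys.getenv("SITS_RUN_TESTS")  # value after the scope has ended

print(inside)
print(outside)
```

Because the change is scoped, the session's environment is left untouched afterwards, which matters when checks run on shared infrastructure.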

Thanks again! We're happy to answer further questions.

@gilbertocamara
Author

Dear @maelle, many thanks for your response. Please see my comments below:

> Nonetheless, we are up for the challenge if you are up for the higher statistical submission requirements and potential changes.

Good! Looking at the specific requirements for rOpenSci statistical packages, the sits package meets most of them, such as G.2 (related to data input), G.3 (algorithms), and G.4 (output data). We will have to review the sits package carefully with respect to requirements G.1 (documentation), G.5 (testing), and those for machine learning.

At first glance, sits complies with requirements SP (spatial software), TS (time series), and UL (unsupervised learning). Since these requirements are very detailed, we will carefully review them to ensure compliance. We believe we meet the PD (probability distributions) requirements.

> One last note regarding check results: As of now we can't set any environment variables when running the checks automatically, so they'd have to be set on your side, maybe using the [withr::local_envvar()](https://withr.r-lib.org/reference/with_envvar.html) function or similar.

Allow me to propose an alternative: please consider the information provided on "codecov.io" to be sufficient to assert that sits meets the code coverage requirements of rOpenSci. If you accept this proposal, it will save us both time and work.

https://app.codecov.io/github/e-sensing/sits/tree/dev/R

We will work on improving sits so that it meets the specifications for rOpenSci statistical packages. We will report back to you when we have a new version that fully meets these specifications.

Thanks,
Gilberto

@maelle
Member

maelle commented Jul 7, 2023

Thank you! 🎉 Here's a direct link to the author guide for stat submissions: https://stats-devguide.ropensci.org/pkgdev.html

@maelle
Member

maelle commented Jul 8, 2023

@gilbertocamara just a clarification: your package will have to comply with one of the category standards of the statistical review system, probably spatial (because time series is for class-based manipulation of time-series data, which is not what your package does as far as I understand).

Probability distributions should be considered an "additional" category that may be complied with in addition to the main categories.

Thank you!

@maelle
Member

maelle commented Jul 11, 2023

@gilbertocamara the CONTRIBUTING guide could link to the book chapter, as long as it's easy to find all information.

I'm still waiting for R CMD check to finish, but examples ran without error (tests now running).

@gilbertocamara
Author

Thanks for the tips!

@maelle
Member

maelle commented Jul 11, 2023

Tests passed! Now on to trying autotest...

@gilbertocamara
Author

Dear @maelle, autotest runs OK in "sits". Now, we have to work on the recommendations. Thanks!

@gilbertocamara
Author

Dear @maelle @mpadge, I would like to ask for your help to understand how autotest works. As I understand it, autotest runs different diagnostics on the functions of a package. It aims to test the resilience of each function to unexpected parameter values, for example NA values. It also tries to guess the parameter type from the Rd documentation; here, it tests the function for invalid entries, e.g., non-integer numeric inputs for integer parameters. That's important and valuable for software designers.

In the sits package, the authors have been very careful to include pre-conditions for all parameters of all functions. All parameters are checked for valid values, and an error message is provided. However, we are finding there is a mismatch between the error messages provided by sits and those expected by autotest. For us, it is not clear what autotest considers as a valid response.

Consider the following function, which takes as input a set of spatially referenced time series and allows the user to select some of its members. Users can either select a number or a fraction of the series. The relevant part of the code is shown below:

#' @title Sample a percentage of a time series
#' @name sits_sample
#' @author Rolf Simoes, \email{rolf.simoes@@inpe.br}
#'
#' @description Takes a sits tibble with different labels and
#'              returns a new tibble. For a given field as a group criterion,
#'              this new tibble contains a given number or percentage
#'              of the total number of samples per group.
#'              Parameter n: number of random samples.
#'              Parameter frac: a fraction of random samples.
#'              If n is greater than the number of samples for a given label,
#'              that label will be sampled with replacement. Also,
#'              if frac > 1 , all sampling will be done with replacement.
#'
#' @param  data       Sits time series.
#' @param  n          Integer: number of samples to select (range: 1 to nrow(data)).
#' @param  frac       Percentage of samples to pick from each group of data.
#' @param  oversample Oversample classes with small number of samples?
#' @return            A sits tibble with a fixed quantity of samples.
#' @examples
#' # Retrieve a set of time series with 2 classes
#' data(cerrado_2classes)
#' # Print the labels of the resulting tibble
#' summary(cerrado_2classes)
#' # Samples the data set
#' data_100 <- sits_sample(cerrado_2classes, n = 100)
#' # Print the labels
#' summary(data_100)
#' # Sample by fraction
#' data_02 <- sits_sample(cerrado_2classes, frac = 0.2)
#' # Print the labels
#' summary(data_02)
#' @export
sits_sample <- function(data,
                        n = NULL,
                        frac = NULL,
                        oversample = TRUE) {
    # set caller to show in errors
    .check_set_caller("sits_sample")
    # verify if data is valid
    .check_samples_ts(data)
    # verify if either n or frac is informed
    .check_that(
        x = !(purrr::is_null(n) & purrr::is_null(frac)),
        local_msg = "neither 'n' or 'frac' parameters were informed",
        msg = "invalid sample parameters"
    )
    # check oversample
    .check_lgl(oversample)
    # check n and frac parameters
    .check_na(n)
    if (!purrr::is_null(n))
        .check_num(n, allow_na = FALSE, is_integer = TRUE,
                   min = 1, max = nrow(data),
                   len_min = 1, len_max = 1,
                   msg = "invalid n parameter")
    .check_na(frac)
    if (!purrr::is_null(frac))
        .check_num(frac, allow_na = FALSE, is_integer = FALSE,
                   min = 0.0, max = 10.0,
                   len_min = 1, len_max = 1,
                   msg = "invalid frac parameter")

The output for autotest for this function is:

  type  test_name fn_name     parameter parameter_type operation content               test  
1 error NA        sits_sample NA        NA             NA        sits_sample: invalid… TRUE 

In the above the content column is:

sits_sample: invalid n parameter (value is not integer)

We are failing to understand what is being tested by autotest and what is the expected response. As you can see from the code above, we explicitly test for NA and test for the valid values of the input parameters. In principle, we cannot find flaws in the error messages we provide. Please see some examples below.

sits_sample(NA)
> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, n = NA)
> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, n = 0.3)
> Error: sits_sample: invalid n parameter (value is not integer)

sits_sample(cerrado_2classes, frac = NA)
> Error: sits_sample: invalid 'x' parameter (NA value is not allowed)

sits_sample(cerrado_2classes, frac =  30)
> Error: sits_sample: invalid frac parameter (value should be <= 10)
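
For readers following along, the kind of pre-condition check that yields messages like those above can be approximated in base R. This is only a sketch under simplified assumptions: `check_num_sketch` is a hypothetical stand-in, not the internal `.check_num` of sits, and its argument list is reduced for illustration.

```r
# Hypothetical stand-in for a numeric pre-condition check (not sits code).
check_num_sketch <- function(x, name, min = -Inf, max = Inf,
                             is_integer = FALSE) {
    prefix <- paste0("invalid ", name, " parameter")
    # reject NA before any other test, since comparisons with NA yield NA
    if (anyNA(x)) {
        stop(prefix, " (NA value is not allowed)", call. = FALSE)
    }
    # a single numeric value is expected
    if (!is.numeric(x) || length(x) != 1L) {
        stop(prefix, " (single numeric value expected)", call. = FALSE)
    }
    # "integer" here means integer-valued, so 100 passes and 0.3 does not
    if (is_integer && x != round(x)) {
        stop(prefix, " (value is not integer)", call. = FALSE)
    }
    if (x < min) {
        stop(prefix, " (value should be >= ", min, ")", call. = FALSE)
    }
    if (x > max) {
        stop(prefix, " (value should be <= ", max, ")", call. = FALSE)
    }
    invisible(x)
}

# Capture the error messages instead of aborting
msg_n <- tryCatch(
    check_num_sketch(0.3, "n", min = 1, max = 100, is_integer = TRUE),
    error = conditionMessage
)
msg_frac <- tryCatch(
    check_num_sketch(30, "frac", min = 0, max = 10),
    error = conditionMessage
)
```

In this sketch a non-integer value for an integer parameter is treated as an error, which matches the `n = 0.3` case shown above.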

We are failing to see what we might be doing wrong. What are the expectations of autotest which are not met by our input parameter tests?

We would appreciate your response.

Best
Gilberto

@gilbertocamara
Author

Dear @maelle @mpadge

Please, could you explain what appears to be an unexpected behaviour of autotest?

Today, I ran autotest twice on version 1.4.2 (dev) of the sits package. The first response had 16 issues (please see the RDS file in https://www.dropbox.com/s/cu4lpm9vjcgxhmw/autotest_1.rds?dl=0). From what I could understand from the autotest output, it complains about the expected return values of R functions that are called for side-effects.

I tried to fix some of these problems by considering the recommendations of the tidyverse design guide. In Section 26 ("Side-effect functions should return invisibly"), the guide states: "If a function is called primarily for its side-effects, it should invisibly return a useful output. If there’s no obvious output, return the first argument". See more at https://design.tidyverse.org/out-invisible.html.
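
A minimal base R illustration of that guideline. The function `save_table_sketch` is hypothetical and not part of sits; it writes a file as its side effect and returns its first argument invisibly.

```r
# Hypothetical side-effect function: writes `data` to `file` and
# returns the first argument invisibly, as the design guide suggests.
save_table_sketch <- function(data, file) {
    utils::write.csv(data, file, row.names = FALSE)
    invisible(data)
}

path <- tempfile(fileext = ".csv")
# withVisible() reports whether the result would be auto-printed
res <- withVisible(save_table_sketch(head(mtcars), path))
res$visible  # FALSE: nothing is printed when called at the console
```

Because the input is returned, the function can still be used in a pipeline even though it is called mainly for its side effect.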

I am assuming that autotest follows the same guidelines. Thus, I included invisible return values in all sits functions that are called for side-effects. Then, I ran autotest again. To my surprise, it flagged 48 issues. Please see the second autotest output at https://www.dropbox.com/s/43zin4ithlectbz/autotest_2.rds?dl=0.

Could you please help me understand why the number of issues reported by autotest increased from 16 to 48? Your help will be most appreciated.

*** MWE ***

# install dev version
devtools::install_github("e-sensing/sits@dev")
# enable examples and tests
Sys.setenv("SITS_RUN_EXAMPLES" = "YES")
Sys.setenv("SITS_RUN_TESTS" = "YES")
# first run of autotest
autotest_1 <- autotest::autotest_package(package = "sits", test = TRUE)
# second run of autotest
autotest_2 <- autotest::autotest_package(package = "sits", test = TRUE)

Best regards
Gilberto

@maelle
Member

maelle commented Jul 17, 2023

Hello! I'll get to this later this week, thanks for your patience!

@gilbertocamara
Author

Dear @maelle @mpadge, begging your indulgence for being insistent, I would like to ask if there is a detailed explanation of the types of diagnostics provided by autotest. Consider the following case. The sits package deals with big data, processing time series of satellite images. All functions that produce new images need to specify a directory where the results are stored. This is achieved by a parameter called output_dir, which is used in 18 functions, with the same parameter name and the same use.

Out of these 18 instances, autotest produces a diagnostic in only two (2) cases. In both instances, it produces a single_char_case diagnostic. As I understand it, this diagnostic works on the premise that changing the case of a character parameter should yield the same result. Obviously, this expectation cannot be met on operating systems where directory names are case-sensitive.

Since the condition to proceed with the review of statistical packages submitted to rOpenSci is that autotest should not find any problems with the code (no diagnostics, no warnings, no errors), I am at a loss on how to proceed. Please advise on what can be done in this case.

Please also explain why autotest only flags this condition in 2 out of the 18 cases where the parameter output_dir is used.
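
To narrow the question down, the flagged functions can be listed by filtering the result table. The sketch below assumes the autotest result behaves like a data frame with the columns shown in the output earlier in this thread (`type`, `test_name`, `fn_name`, `parameter`); the rows and the function names `fn_a` and `fn_b` are invented purely for illustration.

```r
# Toy stand-in for an autotest result; all rows are made up.
toy_result <- data.frame(
    type = c("diagnostic", "diagnostic", "diagnostic"),
    test_name = c("single_char_case", "single_char_case", "other_test"),
    fn_name = c("fn_a", "fn_b", "fn_a"),
    parameter = c("output_dir", "output_dir", "data"),
    stringsAsFactors = FALSE
)

# Keep only the single_char_case diagnostics raised for output_dir
flagged <- subset(
    toy_result,
    parameter == "output_dir" & test_name == "single_char_case"
)

# Functions for which the diagnostic was raised
flagged_fns <- unique(flagged$fn_name)
```

Comparing `flagged_fns` with the full list of functions that take `output_dir` would show which ones were not flagged.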

Many thanks for your help,
Gilberto

@maelle
Member

maelle commented Jul 18, 2023

res <- autotest::autotest_package("/home/maelle/Documents/ropensci/SOFTWARE-REVIEW/sits", test = TRUE)
#> Loading required namespace: devtools
#> ℹ Loading sits
#> SITS - satellite image time series analysis.
#> 
#> Loaded sits v1.4.2.
#>         See ?sits for help, citation("sits") for use in publication.
#>         Documentation avaliable in https://e-sensing.github.io/sitsbook/.
#> 
#> ★ Extracting example code from 107 .Rd files
#>   |======================================================================| 100%
#> ✔ Extracted example code
#> ★ Converting 47 examples to yaml
#>   |======================================================================| 100%
#> ✔ Converted examples to yaml
#> 
#> ── autotesting sits ──
#> 
#> ✔ [1 / 19]: sits_clean
#> ✔ [2 / 19]: sits_cluster_clean

#> ✔ [3 / 19]: sits_cluster_dendro
#> ✔ [4 / 19]: sits_cluster_frequency
#> ✔ [5 / 19]: sits_config_show
#> ✔ [6 / 19]: sits_labels
#> ✔ [7 / 19]: sits_pred_features
#> ✔ [8 / 19]: sits_pred_normalize
#> ✔ [9 / 19]: sits_pred_references
#> ✔ [10 / 19]: sits_pred_sample
#> ✔ [11 / 19]: sits_predictors
#> ✔ [12 / 19]: sits_reclassify
#> ✔ [13 / 19]: sits_sample
#> ✔ [14 / 19]: sits_select
#> ✔ [15 / 19]: sits_select
#> ✔ [16 / 19]: sits_stats
#> ✔ [17 / 19]: sits_timeline
#> ✔ [18 / 19]: sits_to_csv
#> ✔ [19 / 19]: sits_validate
knitr::kable(res)
| type | test_name | fn_name | parameter | parameter_type | operation | content | test | yaml_hash |
|---|---|---|---|---|---|---|---|---|
| error | NA | sits_clean | NA | NA | normal function call | argument “cube” is missing, with no default | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| error | NA | sits_clean | NA | NA | NA | argument “cube” is missing, with no default | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| error | negate_logical | sits_clean | progress | single logical | Negate default value of logical parameter | argument “cube” is missing, with no default | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| error | return_successful | sits_clean | (return object) | (return object) | error from normal operation | argument “cube” is missing, with no default | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| error | NA | sits_pred_sample | NA | NA | normal function call | :group_by(pred, .data[[“label”]]): argument “pred” is missing, with no default | TRUE | 3f65689ba22d83703d7e8dfea39329c8 |
| error | return_successful | sits_pred_sample | (return object) | (return object) | error from normal operation | argument “pred” is missing, with no default | TRUE | 3f65689ba22d83703d7e8dfea39329c8 |
| error | NA | sits_reclassify | NA | NA | normal function call | argument “cube” is missing, with no default | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| error | NA | sits_reclassify | NA | NA | NA | argument “cube” is missing, with no default | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| error | return_successful | sits_reclassify | (return object) | (return object) | error from normal operation | argument “cube” is missing, with no default | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| error | NA | sits_sample | NA | NA | NA | sits_sample: invalid value - param is not integer (value is not integer) | TRUE | 20524a33c207002d7fb57cf205a60f57 |
| error | NA | sits_select | NA | NA | normal function call | sits_select: invalid date format (‘start_date’ and ‘end_date’ should follow year-month-day format: YYYY-MM-DD) | TRUE | 48668453b1a14d43448243a281745542 |
| error | return_successful | sits_select | (return object) | (return object) | error from normal operation | sits_select: invalid date format (‘start_date’ and ‘end_date’ should follow year-month-day format: YYYY-MM-DD) | TRUE | 48668453b1a14d43448243a281745542 |
| warning | par_is_demonstrated | sits_cluster_dendro | bands | NA | Check that parameter usage is demonstrated | Examples do not demonstrate usage of this parameter | TRUE | NA |
| warning | par_is_demonstrated | sits_cluster_dendro | k | NA | Check that parameter usage is demonstrated | Examples do not demonstrate usage of this parameter | TRUE | NA |
| warning | par_is_demonstrated | sits_validate | samples_validation | NA | Check that parameter usage is demonstrated | Examples do not demonstrate usage of this parameter | TRUE | NA |
| diagnostic | int_range | sits_clean | window_size | single integer | Ascertain permissible range | Function [sits_clean] does not respond appropriately for specified/default input [window_size = 5] | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | int_range | sits_clean | memsize | single integer | Ascertain permissible range | Function [sits_clean] does not respond appropriately for specified/default input [memsize = 8] | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | int_range | sits_clean | multicores | single integer | Ascertain permissible range | Function [sits_clean] does not respond appropriately for specified/default input [multicores = 2] | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | single_char_case | sits_clean | output_dir | single character | lower-case character parameter | is case dependent | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | single_char_case | sits_clean | output_dir | single character | upper-case character parameter | is case dependent | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | single_char_case | sits_clean | version | single character | lower-case character parameter | is case dependent | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | single_char_case | sits_clean | version | single character | upper-case character parameter | is case dependent | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | return_desc_includes_class | sits_clean | (return object) | (return object) | Check whether description of return value specifies class | Function [sits_clean] returns a value of class [simpleError, error, condition], which differs from the value provided in the description | TRUE | 5730a40764ee0ff672f0fb3f696ee882 |
| diagnostic | vector_custom_class | sits_cluster_dendro | samples | vector | Custom class definitions for vector input | Function [sits_cluster_dendro] errors on vector columns with different classes when submitted as samples Error message: tryCatch: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | single_char_case | sits_cluster_dendro | dist_method | single character | lower-case character parameter | is case dependent | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | single_char_case | sits_cluster_dendro | dist_method | single character | upper-case character parameter | is case dependent | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | single_char_case | sits_cluster_dendro | linkage | single character | lower-case character parameter | is case dependent | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | single_char_case | sits_cluster_dendro | linkage | single character | upper-case character parameter | is case dependent | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | single_char_case | sits_cluster_dendro | palette | single character | lower-case character parameter | is case dependent | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | single_char_case | sits_cluster_dendro | palette | single character | upper-case character parameter | is case dependent | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | random_char_string | sits_cluster_dendro | palette | single character | random character string as parameter | does not match arguments to expected values | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | single_par_as_length_2 | sits_cluster_dendro | palette | single character | Length 2 vector for length 1 parameter | Parameter [palette] of function [sits_cluster_dendro] is only used a single character value, but responds to vectors of length > 1 | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | subst_int_for_logical | sits_cluster_dendro | .plot | single logical | Substitute integer values for logical parameter | (Function call should still work unless explicitly prevented) | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | return_desc_includes_class | sits_cluster_dendro | (return object) | (return object) | Check whether description of return value specifies class | Function [sits_cluster_dendro] returns a value of class [sits_cluster, sits, tbl_df, tbl, data.frame], which differs from the value provided in the description | TRUE | bb0d908f751509f98d150384274f8529 |
| diagnostic | vector_custom_class | sits_labels | data | vector | Custom class definitions for vector input | Function [sits_labels] errors on vector columns with different classes when submitted as data Error message: cannot coerce class ‘“different”’ to a data.frame | TRUE | 19a24496601a7a343f58f03b8bb428f0 |
| diagnostic | return_desc_includes_class | sits_pred_sample | (return object) | (return object) | Check whether description of return value specifies class | Function [sits_pred_sample] returns a value of class [simpleError, error, condition], which differs from the value provided in the description | TRUE | 3f65689ba22d83703d7e8dfea39329c8 |
| diagnostic | vector_custom_class | sits_predictors | samples | vector | Custom class definitions for vector input | Function [sits_predictors] errors on vector columns with different classes when submitted as samples Error message: tryCatch: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) | TRUE | 1f67e542daa971702a44e33b80cbf7df |
| diagnostic | single_char_case | sits_reclassify | rules | single character | lower-case character parameter | is case dependent | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | single_char_case | sits_reclassify | rules | single character | upper-case character parameter | is case dependent | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | int_range | sits_reclassify | memsize | single integer | Ascertain permissible range | Function [sits_reclassify] does not respond appropriately for specified/default input [memsize = 4] | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | int_range | sits_reclassify | multicores | single integer | Ascertain permissible range | Function [sits_reclassify] does not respond appropriately for specified/default input [multicores = 2] | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | single_char_case | sits_reclassify | output_dir | single character | lower-case character parameter | is case dependent | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | single_char_case | sits_reclassify | output_dir | single character | upper-case character parameter | is case dependent | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | single_char_case | sits_reclassify | version | single character | lower-case character parameter | is case dependent | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | single_char_case | sits_reclassify | version | single character | upper-case character parameter | is case dependent | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | return_desc_includes_class | sits_reclassify | (return object) | (return object) | Check whether description of return value specifies class | Function [sits_reclassify] returns a value of class [simpleError, error, condition], which differs from the value provided in the description | TRUE | 7b320a8a78d142a00c6cd5e03f99106b |
| diagnostic | vector_custom_class | sits_sample | data | vector | Custom class definitions for vector input | Function [sits_sample] errors on vector columns with different classes when submitted as data Error message: sits_sample: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) | TRUE | 20524a33c207002d7fb57cf205a60f57 |
| diagnostic | int_range | sits_sample | n | single integer | Ascertain permissible range | Parameter [n] defines only one positive or negative limit; plese either specify both lower and upper limits, or that values must be ‘positive’ or ‘negative’ | TRUE | 20524a33c207002d7fb57cf205a60f57 |
| diagnostic | vector_custom_class | sits_select | data | vector | Custom class definitions for vector input | Function [sits_select] errors on vector columns with different classes when submitted as data Error message: cannot coerce class ‘“different”’ to a data.frame | TRUE | aa507da50025d46879f398dbdd2851a9 |
| diagnostic | single_char_case | sits_select | bands | single character | lower-case character parameter | is case dependent | TRUE | aa507da50025d46879f398dbdd2851a9 |
| diagnostic | vector_custom_class | sits_select | data | vector | Custom class definitions for vector input | Function [sits_select] errors on vector columns with different classes when submitted as data Error message: sits_select: invalid date format (‘start_date’ and ‘end_date’ should follow year-month-day format: YYYY-MM-DD) | TRUE | 48668453b1a14d43448243a281745542 |
| diagnostic | return_desc_includes_class | sits_select | (return object) | (return object) | Check whether description of return value specifies class | Function [sits_select] returns a value of class [simpleError, error, condition], which differs from the value provided in the description | TRUE | 48668453b1a14d43448243a281745542 |
| diagnostic | vector_custom_class | sits_timeline | data | vector | Custom class definitions for vector input | Function [sits_timeline] errors on vector columns with different classes when submitted as data Error message: cannot coerce class ‘“different”’ to a data.frame | TRUE | 61a55bad554b362221b02d495c3015f7 |
| diagnostic | vector_custom_class | sits_to_csv | data | vector | Custom class definitions for vector input | Function [sits_to_csv] errors on vector columns with different classes when submitted as data Error message: sits_metadata_to_csv: invalid samples file (all(.conf(“df_sample_columns”) %in% colnames(data)) is not TRUE) | TRUE | 6f7afd7299bacb77f4656d9672c7e46b |
| diagnostic | single_char_case | sits_to_csv | file | single character | lower-case character parameter | is case dependent | TRUE | 6f7afd7299bacb77f4656d9672c7e46b |
| diagnostic | single_char_case | sits_to_csv | file | single character | upper-case character parameter | is case dependent | TRUE | 6f7afd7299bacb77f4656d9672c7e46b |
| diagnostic | random_char_string | sits_to_csv | file | single character | random character string as parameter | does not match arguments to expected values | TRUE | 6f7afd7299bacb77f4656d9672c7e46b |
| diagnostic | return_desc_includes_class | sits_to_csv | (return object) | (return object) | Check whether description of return value specifies class | Function [sits_to_csv] returns a value of class [sits, tbl_df, tbl, data.frame], which differs from the value provided in the description | TRUE | 6f7afd7299bacb77f4656d9672c7e46b |
| message | NA | sits_cluster_dendro | NA | NA | normal function call | calculating dendrogram… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | NA | sits_cluster_dendro | NA | NA | normal function call | finding the best cut… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | NA | sits_cluster_dendro | NA | NA | normal function call | best number of clusters = 6 | TRUE | bb0d908f751509f98d150384274f8529 |
| message | NA | sits_cluster_dendro | NA | NA | normal function call | best height for cutting the dendrogram = 20.3965461960775 | TRUE | bb0d908f751509f98d150384274f8529 |
| message | NA | sits_cluster_dendro | NA | NA | normal function call | cutting the tree… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | NA | sits_cluster_dendro | NA | NA | normal function call | Plotting dendrogram… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | NA | sits_cluster_dendro | NA | NA | normal function call | result is a tibble with cluster indexes… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | negate_logical | sits_cluster_dendro | .plot | single logical | Negate default value of logical parameter | calculating dendrogram… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | negate_logical | sits_cluster_dendro | .plot | single logical | Negate default value of logical parameter | finding the best cut… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | negate_logical | sits_cluster_dendro | .plot | single logical | Negate default value of logical parameter | best number of clusters = 6 | TRUE | bb0d908f751509f98d150384274f8529 |
| message | negate_logical | sits_cluster_dendro | .plot | single logical | Negate default value of logical parameter | best height for cutting the dendrogram = 20.3965461960775 | TRUE | bb0d908f751509f98d150384274f8529 |
| message | negate_logical | sits_cluster_dendro | .plot | single logical | Negate default value of logical parameter | cutting the tree… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | negate_logical | sits_cluster_dendro | .plot | single logical | Negate default value of logical parameter | Plotting dendrogram… | TRUE | bb0d908f751509f98d150384274f8529 |
| message | negate_logical | sits_cluster_dendro | .plot | single logical | Negate default value of logical parameter | result is a tibble with cluster indexes… | TRUE | bb0d908f751509f98d150384274f8529 |

Created on 2023-07-18 with reprex v2.0.2

@maelle
Member

maelle commented Jul 18, 2023

@gilbertocamara regarding the output you mentioned in your comment #596 (comment), can you confirm it's gone? I don't see that exact error (I'm going through your comments chronologically).

@gilbertocamara
Author

Dear @maelle, above you have shown the autotest output without running the actual tests. My comments above refer to the output with the parameter test set to TRUE. This is the result that counts.

@maelle
Member

maelle commented Jul 18, 2023

> I would like to ask if there is a detailed explanation of the types of diagnostics provided by autotest

Good question. To me the best answer is currently https://docs.ropensci.org/autotest/reference/autotest_types.html. Does it help? I opened an issue in autotest because I agree the documentation could be improved on this front: ropensci-review-tools/autotest#83

@maelle
Member

maelle commented Jul 18, 2023

currently actually running the tests 😅 sorry about that

@maelle
Member

maelle commented Jul 18, 2023

Regarding the flagging of 2/18 functions, obviously I'll have a better idea once I have the results locally, but since autotest works by scraping examples, this might be due to different examples in these 2 functions?

@maelle
Member

maelle commented Jul 18, 2023

I updated the results I get. Are they the same as on your machine @gilbertocamara?

@gilbertocamara
Author

Unfortunately, no. Please wait a little bit. Yesterday, we made some changes to sits trying to match the expectations of autotest. We are currently running the latest test. Please give me until the end of the morning BRT to provide you with an update.

@maelle
Member

maelle commented Jul 18, 2023

Ok. My answer might have a few days delay depending on my availability but I'll do my best. I'll re-read autotest docs. 😁

@mpadge
Member

mpadge commented Jul 19, 2023

@gilbertocamara Please accept my apologies as lead developer of autotest for issues here. I am currently away on holidays until start of August. Thank you for taking the autotest procedures and results so seriously, but please note that autotest is currently recommended and not required infrastructure. As such, its output should currently be considered (nothing more than) a useful guide to increase general robustness and documentation of packages prior to submission. It is not necessary for packages to completely pass autotest in order for submissions to proceed.

In short: Please use autotest to help improve your package as much as possible. Once you are satisfied, feel free to ignore any remaining autotest issues and proceed on to documentation of statistical standards compliance.

That said, please also feel free to open any issues in the autotest repository, or to ask any further questions there. The package will undergo a major revision hopefully sometime later this year, which will include numerous improvements in functionality, documentation, and general usability. Again, thank you for engaging so sincerely with these results, and apologies for any confusion during the process.

@gilbertocamara
Author

Dear @mpadge Many thanks for your response, even during your holidays. We will follow your recommendations and proceed to SRR once we consider we have followed all the relevant recommendations of autotest.

@mpadge
Member

mpadge commented Feb 21, 2024

Any updates @gilbertocamara? As said, don't worry too much (or indeed at all) about autotest, but we would like to proceed with your submission 👍

@gilbertocamara
Author

Dear @mpadge Apologies for the long delay in responding. First of all, kudos to ROpenSci for your work! In the last two months, we have been working on release 1.5.0 of the sits package, which is due April 30th. In this release, we have tried to incorporate in sits the guidelines for Statistical Software proposed by ROpenSci in connection with the srr package. In particular, we followed the guidelines associated with machine learning, spatial and time series packages.

A key point here is that arguably sits is currently unique in the R package landscape. It provides an end-to-end environment for ML/DL analysis of big Earth observation data. We are not aware of any similar package. Thus, many recommendations that would apply to packages that implement improved versions of existing ML algorithms do not apply. A second point is that there is a full book on sits available at https://e-sensing.github.io/sitsbook/, which allows users to perform extended tests and experiments with medium-sized datasets that cannot be loaded in CRAN. We also provide large data sets on GitHub to serve as a basis for user experiments.

The sits package is supported by a set of packages available in the GitHub repository (https://github.com/e-sensing), which include sitsdata (medium-sized data sets used in the book) and rondonia20LMR, a data set of 28 GB for testing ML methods for image time series in a big data context.

Overall, we found the guidelines to be very useful. To avoid lengthy issues, I will post my thoughts on the SRR guidelines in a set of comments below.

@gilbertocamara
Author

Dear @mpadge Some comments on the SRR Generic Guidelines.

The generic guidelines make good points, especially regarding documentation and error messages associated with variable checking. They encouraged us to:

(a) improve documentation of internal functions (G1.4a);
(b) provide a CONTRIBUTING.md statement (G1.2);
(c) include assertions on all input parameters (G2.0);
(d) handle missing data and provide an explicit imputation function (G2.14);
(e) include specific messages for each different error, including the names of the parameters involved (G5.2);
(f) include tests that support a 94% code coverage (G5.4);
(g) include edge-condition tests and associated messages (G5.8).
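
As an illustration of items (c) and (e) only, a minimal base R sketch of an input assertion that names the offending parameter could look like the following. The helper `check_positive_integer` and its message wording are hypothetical and are not taken from sits.

```r
# Hypothetical helper (not part of sits): validates a single argument and
# reports the parameter name in the error message.
check_positive_integer <- function(x, arg_name) {
  ok <- is.numeric(x) && length(x) == 1L && !is.na(x) &&
    x == round(x) && x > 0
  if (!ok) {
    stop(
      sprintf(
        "invalid value: parameter '%s' must be a single positive integer",
        arg_name
      ),
      call. = FALSE
    )
  }
  # Return the validated value as an integer, invisibly
  invisible(as.integer(x))
}

# Usage: a valid value passes, an invalid one raises a parameter-specific error
check_positive_integer(5, "window_size")
msg <- tryCatch(
  check_positive_integer(2.5, "window_size"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Keeping the parameter name in the message makes it possible for edge-condition tests to check which argument triggered the failure, rather than only that a failure occurred.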

We could not fully understand the scope of G5.6 (parameter recovery) and G5.9 (noise susceptibility tests), so we consider that they do not apply to sits.

We missed explicit support in the SRR Guidelines regarding the tidyverse. While we understand that there is resistance to the tidyverse in certain quarters, in software engineering terms the tidyverse is much better than traditional R *apply methods for data handling. We could not have developed a reliable and efficient package without the tidyverse.

@gilbertocamara
Author

Dear @mpadge comments on SRR Guidelines on Machine Learning

From the perspective of the sits package and Earth observation in general, the first part of the ML guidelines (ML1.0 to ML1.5) seems to have an excessive focus on the differentiation between training and test data. In the case of EO data, R packages have to deal with CRAN limitations on example data sets. We tried to overcome this limitation by providing a specific chapter in the on-line book (https://e-sensing.github.io/sitsbook/validation-and-accuracy-measurements.html) and with additional packages which are available on GitHub, as explained above.

Guidelines ML1.6 to ML1.8 deal with missing values. They were useful for us as reminders, taking into account that missing values in EO data arise in a different context than in tabular data.

We consider guideline ML2.0 quite important. In sits, we actually developed a single interface which encapsulates different models using closures. In fact, the guideline ML2.0 is better elaborated in ML4.0 and later in ML5.0. Could one consider merging them?
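
The idea of a single interface that encapsulates different models using closures can be pictured with a small base R sketch. This is not the sits implementation: `make_classifier`, `centroid_fit`, and `centroid_predict` are hypothetical names, and the toy nearest-centroid model only stands in for a real ML/DL model.

```r
# Hypothetical sketch (not the sits API): a generic trainer returns a closure
# that keeps the fitted model private and exposes only prediction.
make_classifier <- function(fit_fn, predict_fn) {
  function(train_x, train_y) {
    model <- fit_fn(train_x, train_y)          # captured by the closure below
    function(new_x) predict_fn(model, new_x)   # uniform prediction interface
  }
}

# Toy model: one centroid (vector of column means) per class label
centroid_fit <- function(x, y) {
  lapply(split(as.data.frame(x), y), colMeans)
}

# Assign each row of new_x to the class with the nearest centroid
centroid_predict <- function(model, new_x) {
  new_x <- as.matrix(new_x)
  dists <- sapply(model, function(centre) rowSums(sweep(new_x, 2, centre)^2))
  dists <- matrix(dists, nrow = nrow(new_x),
                  dimnames = list(NULL, names(model)))
  colnames(dists)[max.col(-dists, ties.method = "first")]
}

# Usage: any (fit, predict) pair plugs into the same two-step interface
train_centroid <- make_classifier(centroid_fit, centroid_predict)
x <- matrix(c(0, 0, 0.2, 0.1, 5, 5, 5.1, 4.9), ncol = 2, byrow = TRUE)
y <- c("forest", "forest", "pasture", "pasture")
classify <- train_centroid(x, y)
pred <- classify(matrix(c(0.1, 0.0, 4.8, 5.2), ncol = 2, byrow = TRUE))
print(pred)
```

With this pattern, code that calls `classify()` does not need to know which model sits behind it, so swapping the model only means passing a different pair of functions to `make_classifier()`.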

We also agree with ML2.2, although we have not yet implemented it. That said, we did not understand the difference between guidelines ML2.2, ML2.3, ML2.4, ML2.5, and ML2.6. Perhaps they could be grouped together for brevity's sake. We did not understand the context of ML3.0 and its subpoints. We fail to see in which context such separation between specification and training might be useful. This is probably due to the specific nature of EO data analysis.

As for item ML3.4 and subitems, we consider that requiring developers to provide functions for tuning hyperparameters might be a better approach, especially with deep learning.

Although we understand the rationale for ML6.0, we advise against using training and test data for model assessment in the case of EO data. The community has developed a specific set of best practices for quality assessment. See Olofsson et al. (2014), doi:10.1016/j.rse.2014.02.015.

In short, the specific case of applying ML/DL to EO data raises issues which are impossible to cover within the scope of generic guidelines such as those provided in SRR for ML.

@gilbertocamara
Author

Dear @mpadge Comments on SRR Guidelines on Spatial and Time Series.

The SRR Spatial Guidelines are very good and generally applicable. The emphasis on the sf package (SP2.1) is welcome. However, we missed guidelines on handling raster data. In our work, we found terra to be better and easier to use than stars. In any case, SRR should discourage the use of raster, as these packages have superseded it. We also missed guidelines regarding visualisation of vector and raster data. For your reference, we found tmap, leaflet and leafem to be excellent packages. Note that both tmap and leafem require raster data to be handled by stars. In sits, we use stars for plotting and visualisation, and terra for access to raster values.

We suggest the inclusion of a guideline regarding the installation of GDAL and PROJ, following the instructions associated with the sf package. See more at https://r-spatial.github.io/sf/#installing. We also suggest that you consider mentioning the desirability of combining sf with the tidyverse. As acknowledged by Edzer Pebesma, the design of sf has been influenced by the tidyverse to the extent that some tidyverse functions can be applied to the output of sf functions. Thus, arguably sf users will find it easier to combine it with the tidyverse.

As for time series, we fully agree with guidelines TS1.0 to TS2.1c and have implemented them in sits. As for guidelines TS2.2 to TS2.4b, we consider that they do not apply, since in general time series derived from satellite data are not stationary. Also, since satellite image time series analysis is about classification and prediction rather than forecasting, we consider that guidelines TS3.0 to TS4.7c do not apply to sits.

@gilbertocamara
Author

gilbertocamara commented Apr 27, 2024

Dear @mpadge Final considerations and a plea for support.

Overall, the SRR Guidelines deserve high praise. The ROpenSci team has provided an excellent service to the community by working hard to develop them. While the guidelines are aimed at small, focused R packages, they are also relevant to larger packages such as sits.

We thus would like your advice on how to proceed. We still consider that a software review of sits would be of much value to us. However, we recognize that sits may fall outside of the scope of the ROpenSci review process. If you are willing to go ahead and review the package, we would be most appreciative. Should you consider that such a review would be cumbersome to the ROpenSci community, we will fully understand your position.

Whatever the case, warm congratulations and thanks to the ROpenSci community!

@mpadge
Member

mpadge commented May 2, 2024

@gilbertocamara Thank you so much for your considered and very deep engagement with our statistical standards. The first thing I would like to ask would be for you to copy the above comments into separate issues within the repository for our Statistical Standards Book - one for the Machine Learning and one for Spatial standards. We'll then incorporate your excellent feedback there via updates to our standards.

We are definitely keen to progress with peer-review here. The sits package is definitely within scope. I imagine the only problems sits may pose will be extra burden on reviewers of such a very large and comprehensive package. But we are definitely excited to learn from guiding this kind of package through our system, and from hopefully mutually beneficial feedback from both sides throughout the process.

Our current Editor-in-Chief @jooolia will take it from here. Thank you for all of your work!

@gilbertocamara
Author

Dear @mpadge Many thanks for your response. I will include the relevant parts of my comments as issues in the GitHub repository for the book. I shall be waiting for instructions from @jooolia on how to proceed.

@jooolia jooolia self-assigned this May 5, 2024
@jooolia
Member

jooolia commented May 5, 2024

@ropensci-review-bot check srr

@jooolia
Member

jooolia commented May 5, 2024

@ropensci-review-bot check package

@ropensci-review-bot
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Checks for sits (v1.4.2-3)

git hash: 06ab1b32

  • ✖️ Package name is not available (on CRAN).
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✖️ Package has no HTML vignettes
  • ✖️ These functions do not have examples: [sits_run_examples, sits_run_tests].
  • ✔️ Package has continuous integration checks.
  • ✖️ Package coverage is 0.1% (should be at least 75%).
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL-2


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type package ncalls
internal base 2176
internal sits 154
internal datasets 6
internal tools 2
imports utils 282
imports stats 173
imports purrr 151
imports dplyr 131
imports sf 70
imports tibble 66
imports slider 61
imports torch 60
imports rstac 58
imports graphics 53
imports grDevices 42
imports terra 42
imports tidyr 41
imports lubridate 21
imports yaml 5
imports sysfonts 1
imports gdalUtilities NA
imports parallel NA
imports Rcpp NA
imports showtext NA
suggests ggplot2 55
suggests digest 20
suggests leaflet 16
suggests httr 14
suggests tmap 14
suggests stars 12
suggests luz 11
suggests mgcv 5
suggests xgboost 5
suggests dtwclust 4
suggests stringr 4
suggests caret 3
suggests FNN 3
suggests scales 3
suggests kohonen 2
suggests methods 2
suggests openxlsx 2
suggests DiagrammeR 1
suggests e1071 1
suggests exactextractr 1
suggests gdalcubes 1
suggests randomForest 1
suggests randomForestExplainer 1
suggests spdep 1
suggests cli NA
suggests dendextend NA
suggests future NA
suggests geojsonsf NA
suggests jsonlite NA
suggests leafem NA
suggests nnet NA
suggests RcppArmadillo NA
suggests supercells NA
suggests testthat NA
suggests torchopt NA
suggests covr NA
linking_to Rcpp NA
linking_to RcppArmadillo NA
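
As a rough illustration of how call tallies like those in this report can be produced, the sketch below counts function calls in a small, made-up code snippet using base R's parser. It is not the code-tagging approach used by pkgstats, and the `code` vector is a hypothetical stand-in for real package source; the sketch also does not attribute calls to packages.

```r
# Hypothetical snippet standing in for a package's source code
code <- c(
  "a <- sqrt(4)",
  "b <- purrr::map(1:3, function(i) sqrt(i))",
  "label <- paste0('n', a)"
)

# Parse with source references so that token-level data is available
parsed <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(parsed)

# Keep only tokens that the parser tagged as function calls
calls <- tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]

# Tally calls per function name, most frequent first
tally <- sort(table(calls), decreasing = TRUE)
print(tally)
```

A token-based count like this is only approximate, which is consistent with the report's own caveat that the automated tallies may not be entirely accurate.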

Tallies of the functions used from each package are listed below. The location of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)' and examining the 'external_calls' table.

base

list (201), source (190), c (185), labels (122), nrow (71), file (68), length (67), scale (45), unlist (45), names (42), row (41), seq_len (37), sum (36), unique (36), date (35), unname (31), col (30), lapply (27), as.numeric (26), paste (26), q (26), toupper (24), for (23), paste0 (23), class (19), options (18), return (18), seq_along (18), as.Date (16), ceiling (16), apply (15), dim (15), round (15), matrix (14), ncol (14), as.character (13), environment (13), by (12), format (12), rep (12), Sys.getenv (12), which (12), colnames (11), try (11), as.matrix (10), do.call (10), mean (10), sqrt (10), suppressMessages (10), tryCatch (10), factor (9), url (9), array (8), floor (8), rbind (8), sample (8), setdiff (8), substitute (8), t (8), table (8), seq (7), strsplit (7), version (7), as.integer (6), cbind (6), character (6), data.frame (6), formals (6), gamma (6), if (6), levels (6), structure (6), system.file (6), which.max (6), which.min (6), any (5), replace (5), suppressWarnings (5), switch (5), Sys.time (5), abs (4), as.list (4), choose (4), get (4), grepl (4), inherits (4), is.na (4), list2env (4), mode (4), readRDS (4), rev (4), row.names (4), split (4), tempdir (4), vapply (4), args (3), colSums (3), diag (3), eval (3), missing (3), path.expand (3), prop.table (3), raw (3), socketSelect (3), tolower (3), as.factor (2), cut (2), debug (2), deparse (2), double (2), gsub (2), I (2), is.logical (2), list.files (2), log (2), logical (2), max.col (2), parent.env (2), pmin (2), quote (2), range (2), rawConnection (2), readLines (2), rowSums (2), signif (2), subset (2), substring (2), tapply (2), within (2), write (2), all (1), all.vars (1), as.vector (1), basename (1), cat (1), difftime (1), dir (1), dir.exists (1), dirname (1), drop (1), exists (1), exp (1), file.exists (1), gc (1), integer (1), is.null (1), kappa (1), lengths (1), new.env (1), numeric (1), open (1), order (1), parent.frame (1), rbind.data.frame (1), remove (1), sample.int (1), sort (1), srcfile (1), summary (1), svd (1), sys.frames (1), tempfile (1), vector (1)

utils

data (267), vi (4), txtProgressBar (3), head (2), write.csv (2), getTxtProgressBar (1), read.csv (1), read.delim (1), tail (1)

stats

ts (44), offset (33), formula (14), D (8), kernel (8), time (8), predict (5), df (4), dt (4), na.action (4), quantile (4), runif (4), sd (4), weights (3), as.formula (2), end (2), median (2), rbeta (2), rf (2), rlnorm (2), rnorm (2), start (2), addmargins (1), as.dendrogram (1), cutree (1), family (1), filter (1), reshape (1), smooth (1), step (1), var (1), window (1)

sits

sits_labels (27), sits_bands (26), sits_timeline (19), sits_select (10), train_fun (8), sits_bbox (4), C_mask_na (3), mixture_fn (3), uncert_fn (3), C_label_max_prob (2), C_max_sampling (2), comb_fn (2), download_fn (2), filter_call (2), label_fn (2), reclassify_fn (2), sits_accuracy (2), sits_train (2), smooth_fn (2), .debug (1), C_entropy_probs (1), C_fill_na (1), C_kernel_max (1), C_kernel_mean (1), C_kernel_median (1), C_kernel_min (1), C_kernel_modal (1), C_kernel_sd (1), C_kernel_var (1), C_least_probs (1), C_margin_probs (1), C_nnls_solver_batch (1), C_normalize_data (1), C_normalize_data_0 (1), choice (1), create_iqr (1), impute_fun (1), plot_samples (1), replace_na (1), result_fun (1), sits_apply (1), sits_as_sf (1), sits_classify (1), sits_clean (1), sits_label_classification (1), sits_som_map (1), sits_validate (1), sits_variance (1), sits_view.sits (1), submit (1)

purrr

map (71), map_chr (20), map_dfr (18), map_lgl (9), map2_dfr (6), pmap (6), map_dfc (5), pmap_dfr (5), is_character (2), map_dbl (2), map_int (2), imap_dfc (1), map2 (1), pmap_chr (1), pmap_lgl (1), transpose (1)

dplyr

mutate (26), filter (18), all_of (14), bind_rows (12), bind_cols (9), select (6), n (5), across (4), left_join (4), slice_sample (4), group_by (3), slice_head (3), summarise (3), c_across (2), distinct (2), full_join (2), select_if (2), starts_with (2), tibble (2), cur_group_id (1), group_split (1), inner_join (1), matches (1), pull (1), relocate (1), rename (1), transmute (1)

sf

st_crs (10), st_transform (9), st_sample (7), st_as_sf (5), st_write (5), st_read (4), read_sf (3), st_drop_geometry (3), st_intersects (3), st_bbox (2), st_geometry (2), st_geometry_type (2), st_intersection (2), st_sf (2), st_sfc (2), st_as_sfc (1), st_contains (1), st_convex_hull (1), st_coordinates (1), st_distance (1), st_is_empty (1), st_multipoint (1), st_polygon (1), st_within (1)

tibble

tibble (49), as_tibble_row (10), as_tibble (6), lst (1)

slider

slide (25), slide_dfr (23), slide_dbl (4), slide_lgl (4), slide2_dfr (3), slide_chr (1), slide2_lgl (1)

torch

nn_module (20), torch_tensor (11), nn_cross_entropy_loss (5), nn_softmax (5), nn_batch_norm1d (4), torch_matmul (3), torch_zeros (3), nn_init_normal_ (2), torch_stack (2), torch_transpose (2), nn_dropout (1), torch_exp (1), torch_mean (1)

rstac

post_request (19), items_fetch (13), items_reap (11), items_matched (6), sign_planetary_computer (5), ext_query (2), stac (1), stac_search (1)

ggplot2

ggplot (11), aes (8), element_text (7), scale_fill_manual (5), geom_line (3), labs (3), theme (3), element_blank (2), element_rect (2), facet_grid (2), guide_legend (2), position_dodge (2), facet_wrap (1), guides (1), scale_color_brewer (1), scale_x_date (1), scale_x_log10 (1)

graphics

segments (14), title (13), points (11), legend (9), plot (3), abline (2), grid (1)

grDevices

colors (30), palette (10), hcl.colors (2)

terra

ext (4), rast (4), xFromCol (4), yFromRow (4), readValues (3), crop (2), crs (2), extract (2), freq (2), spatSample (2), fileBlocksize (1), NAflag (1), ncol (1), nlyr (1), nrow (1), readStart (1), values (1), xmax (1), xmin (1), xres (1), ymax (1), ymin (1), yres (1)

tidyr

nest (15), unnest (14), pivot_longer (6), expand_grid (3), starts_with (2), everything (1)

lubridate

as_date (16), fast_strptime (1), mday (1), month (1), year (1), ymd (1)

digest

digest (20)

leaflet

layersControlOptions (12), colorFactor (2), leaflet (1), providers$GeoportailFrance.orthos (1)

httr

add_headers (6), write_disk (3), parse_url (2), build_url (1), content (1), GET (1)

tmap

tm_borders (7), tm_shape (5), tm_facets (1), tmap_options (1)

stars

read_stars (8), st_warp (4)

luz

luz_metric_accuracy (5), setup (5), luz_callback_early_stopping (1)

datasets

trees (6)

mgcv

gam (3), predict.gam (2)

xgboost

xgb.plot.tree (3), xgb.DMatrix (1), xgb.train (1)

yaml

yaml.load_file (4), as.yaml (1)

dtwclust

hierarchical_control (2), cvi (1), tsclust (1)

stringr

str_wrap (3), str_count (1)

caret

confusionMatrix (3)

FNN

knnx.index (3)

scales

date_format (1), label_number (1), pretty_breaks (1)

kohonen

somgrid (1), supersom (1)

methods

S3Part (2)

openxlsx

createWorkbook (1), saveWorkbook (1)

tools

file_path_sans_ext (2)

DiagrammeR

render_graph (1)

e1071

svm (1)

exactextractr

exact_extract (1)

gdalcubes

image_collection (1)

randomForest

randomForest (1)

randomForestExplainer

plot_min_depth_distribution (1)

spdep

poly2nb (1)

sysfonts

font_add_google (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties

The package has:

  • code in C++ (6% in 15 files) and R (94% in 136 files)
  • 8 authors
  • no vignette
  • 4 internal data files
  • 20 imported packages
  • 236 exported functions (median 11 lines of code)
  • 2360 non-exported functions in R (median 7 lines of code)
  • 92 R functions (median 11 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 136 99.5
files_src 15 95.4
files_vignettes 0 0.0 TRUE
files_tests 50 99.2
loc_R 21468 99.7 TRUE
loc_src 1304 68.0
loc_tests 6162 98.6 TRUE
num_vignettes 0 0.0 TRUE
data_size_total 186830 86.9
data_size_median 50517 88.2
n_fns_r 2596 99.9 TRUE
n_fns_r_exported 236 98.9 TRUE
n_fns_r_not_exported 2360 99.9 TRUE
n_fns_src 92 78.3
n_fns_per_file_r 10 86.8
n_fns_per_file_src 6 56.2
num_params_per_fn 3 33.6
loc_per_fn_r 7 16.0
loc_per_fn_r_exp 11 25.1
loc_per_fn_r_not_exp 7 18.0
loc_per_fn_src 11 28.5
rel_whitespace_R 5 96.7 TRUE
rel_whitespace_src 12 60.1
rel_whitespace_tests 12 97.1 TRUE
doclines_per_fn_exp 68 79.0
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 4282 99.6 TRUE

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks

3a. Continuous Integration Badges

R-CMD-check.yaml

GitHub Workflow Results

id name conclusion sha run_number date
8933087450 R-CMD-check success 1ccc1c 358 2024-05-03

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following note:

  1. checking installed package size ... NOTE
    installed size is 18.4Mb
    sub-directories of 1Mb or more:
    extdata 1.7Mb
    libs 14.8Mb
    R 1.1Mb

R CMD check generated the following check_fail:

  1. rcmdcheck_reasonable_installed_size

Test coverage with covr

Package coverage: 0.06

The following files are not completely covered by tests:

file coverage
R/api_accessors.R 0%
R/api_accuracy.R 0%
R/api_apply.R 0%
R/api_band.R 0%
R/api_bbox.R 0%
R/api_block.R 0%
R/api_check.R 0%
R/api_chunks.R 0%
R/api_classify.R 0%
R/api_clean.R 0%
R/api_cluster.R 0%
R/api_colors.R 0%
R/api_combine_predictions.R 0%
R/api_comp.R 0%
R/api_conf.R 0%
R/api_csv.R 0%
R/api_cube.R 0%
R/api_data.R 0%
R/api_debug.R 0%
R/api_download.R 0%
R/api_factory.R 0%
R/api_file_info.R 0%
R/api_file.R 0%
R/api_gdal.R 0%
R/api_gdalcubes.R 0%
R/api_imputation.R 0%
R/api_jobs.R 0%
R/api_label_class.R 0%
R/api_mixture_model.R 0%
R/api_ml_model.R 0%
R/api_mosaic.R 0%
R/api_parallel.R 0%
R/api_period.R 0%
R/api_plot_raster.R 0%
R/api_plot_time_series.R 0%
R/api_plot_vector.R 0%
R/api_point.R 0%
R/api_predictors.R 0%
R/api_raster_sub_image.R 0%
R/api_raster_terra.R 0%
R/api_raster.R 0%
R/api_reclassify.R 0%
R/api_regularize.R 0%
R/api_roi.R 0%
R/api_s2tile.R 0%
R/api_samples.R 0%
R/api_segments.R 0%
R/api_sf.R 0%
R/api_shp.R 0%
R/api_signal.R 0%
R/api_smooth.R 0%
R/api_smote.R 0%
R/api_som.R 0%
R/api_source_aws.R 0%
R/api_source_bdc.R 0%
R/api_source_deafrica.R 0%
R/api_source_hls.R 0%
R/api_source_local.R 0%
R/api_source_mpc.R 0%
R/api_source_sdc.R 0%
R/api_source_stac.R 0%
R/api_source_usgs.R 0%
R/api_source.R 0%
R/api_space_time_operations.R 0%
R/api_stac.R 0%
R/api_stats.R 0%
R/api_summary.R 0%
R/api_tibble.R 0%
R/api_tile.R 0%
R/api_timeline.R 0%
R/api_torch.R 0%
R/api_ts.R 0%
R/api_tuning.R 0%
R/api_uncertainty.R 0%
R/api_utils.R 0%
R/api_values.R 0%
R/api_variance.R 0%
R/api_vector_info.R 0%
R/api_vector.R 0%
R/api_view.R 0%
R/sits_accuracy.R 0%
R/sits_active_learning.R 0%
R/sits_apply.R 0%
R/sits_bands.R 0%
R/sits_bbox.R 0%
R/sits_classify.R 0%
R/sits_clean.R 0%
R/sits_cluster.R 0%
R/sits_colors.R 0%
R/sits_combine_predictions.R 0%
R/sits_config.R 0%
R/sits_csv.R 0%
R/sits_cube_copy.R 0%
R/sits_cube.R 0%
R/sits_factory.R 0%
R/sits_filters.R 0%
R/sits_geo_dist.R 0%
R/sits_get_data.R 0%
R/sits_label_classification.R 0%
R/sits_labels.R 0%
R/sits_lighttae.R 0%
R/sits_machine_learning.R 0%
R/sits_merge.R 0%
R/sits_mixture_model.R 0%
R/sits_mlp.R 0%
R/sits_model_export.R 0%
R/sits_mosaic.R 0%
R/sits_patterns.R 0%
R/sits_plot.R 0%
R/sits_predictors.R 0%
R/sits_reclassify.R 0%
R/sits_regularize.R 0%
R/sits_resnet.R 0%
R/sits_sample_functions.R 0%
R/sits_segmentation.R 0%
R/sits_select.R 0%
R/sits_sf.R 0%
R/sits_smooth.R 0%
R/sits_som.R 0%
R/sits_summary.R 0%
R/sits_tae.R 0%
R/sits_tempcnn.R 0%
R/sits_timeline.R 0%
R/sits_train.R 0%
R/sits_tuning.R 0%
R/sits_uncertainty.R 0%
R/sits_utils.R 12.5%
R/sits_validate.R 0%
R/sits_variance.R 0%
R/sits_view.R 0%
R/sits_xlsx.R 0%
src/combine_data.cpp 0%
src/kernel.cpp 0%
src/label_class.cpp 0%
src/linear_interp.cpp 0%
src/nnls_solver.cpp 0%
src/normalize_data_0.cpp 0%
src/normalize_data.cpp 0%
src/sample_points.cpp 0%
src/sampling_window.cpp 0%
src/smooth_bayes.cpp 0%
src/smooth_sgp.cpp 0%
src/smooth_whit.cpp 0%
src/smooth.cpp 0%
src/uncertainty.cpp 0%

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function cyclocomplexity
sits_select.raster_cube 18
sits_classify.raster_cube 15

Static code analyses with lintr

lintr found the following 45 potential issues:

message number of times
Avoid library() and require() calls in packages 17
Lines should not be more than 80 characters. 23
Use <-, not =, for assignment. 5
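
For readers unfamiliar with the first and third lintr messages, a minimal sketch of the usual alternative follows: checking for a package with `requireNamespace()` and calling its functions with `::` instead of attaching it with `library()`, and assigning with `<-`. The helper names `check_suggested()` and `median_of()` are hypothetical and are not taken from sits.

```r
# Hypothetical helper: stop with a clear message if a package is missing,
# instead of calling library() or require() inside package code
check_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is required; please install it.",
         call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical function that depends on another package
median_of <- function(x) {
  check_suggested("stats")
  # Namespace-qualified call; assignment uses <- rather than =
  result <- stats::median(x)
  result
}

median_of(c(1, 3, 2))
```

Qualifying calls with `::` keeps the user's search path unchanged, which is the concern behind the lintr message.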


Package Versions

package version
pkgstats 0.1.3.13
pkgcheck 0.1.2.21


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@gilbertocamara
Author

Dear @jooolia

Many thanks for starting the review process of R package sits.

Regarding the items mentioned:

✖️ Package name is not available (on CRAN):
Version 1.5.0 will be submitted to CRAN tomorrow. I will inform you when it is accepted.

✖️ Package has no HTML vignettes:
The package has a full on-line book (see https://e-sensing.github.io/sitsbook/) and so there is no need for HTML vignettes.

✖️ These functions do not have examples: [sits_run_examples, sits_run_tests].
These are auxiliary functions, used to avoid running tests and examples during CRAN checks.

✖️ Package coverage is 0.1% (should be at least 75%).
In fact, package coverage is 94% (see https://app.codecov.io/gh/e-sensing/sits). SITS includes functions that access cloud services and functions that take a long time to run. We access seven cloud services, and some of them may be off-line at any given time. For this reason, tests and examples are run off-line. To run tests and examples, please set the following environment variables:

Sys.setenv(SITS_RUN_TESTS = "YES")
Sys.setenv(SITS_RUN_EXAMPLES = "YES")
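
As a minimal sketch of how such a switch can work, the snippet below gates slow tests on the value of `SITS_RUN_TESTS`. The helper `run_slow_tests()` is hypothetical and is not the implementation used in sits; it only assumes that the variable is compared against the string "YES", as in the lines above.

```r
# Hypothetical helper: TRUE only when the variable is set to "YES"
run_slow_tests <- function() {
  identical(Sys.getenv("SITS_RUN_TESTS"), "YES")
}

# With the variable unset, slow tests would be skipped
Sys.unsetenv("SITS_RUN_TESTS")
skipped_by_default <- !run_slow_tests()

# After opting in, slow tests would run
Sys.setenv(SITS_RUN_TESTS = "YES")
enabled_after_opt_in <- run_slow_tests()

print(c(skipped_by_default, enabled_after_opt_in))
```

In a testthat file, a helper of this kind could be passed to `testthat::skip_if_not()` at the top of each slow test, which may explain why a coverage run that does not set the variable reports almost no coverage.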

The issues raised by lintr and cyclocomp have been addressed in version 1.5.0.

@gilbertocamara
Author

Dear @jooolia, version 1.5.0 of sits is now on CRAN. Whenever you feel it is appropriate, you can start the software review of the package. Please note that all issues raised by the automated bot have been answered above.
