I agree to follow this project's Contributing Guidelines.
Description
A modified version of the README example:
```r
library(magrittr)
library(data.validator)

report <- data_validation_report()

# WRONG_COLUMN_NAME is deliberately not a column of mtcars
validate(mtcars, name = "Verifying cars dataset") %>%
  validate_if(drat > 0, description = "Column drat has only positive values") %>%
  validate_cols(in_set(c(0, 2)), WRONG_COLUMN_NAME, vs, am, description = "vs and am values equal 0 or 2 only") %>%
  validate_cols(within_n_sds(1), mpg, description = "mpg within 1 sds") %>%
  validate_rows(num_row_NAs, within_bounds(0, 2), vs, am, mpg, description = "not too many NAs in rows") %>%
  validate_rows(maha_dist, within_n_mads(10), everything(), description = "maha dist within 10 mads") %>%
  add_results(report)

print(report)
```
The error:
```
> validate(mtcars, name = "Verifying cars dataset") %>%
+ validate_if(drat > 0, description = "Column drat has only positive values") %>%
+ validate_cols(in_set(c(0, 2)), WRONG_COLUMN_NAME, vs, am, description = "vs and am values equal 0 or 2 only") %>%
+ validate_cols(within_n_sds(1), mpg, description = "mpg within 1 sds") %>%
+ validate_rows(num_row_NAs, within_bounds(0, 2), vs, am, mpg, description = "not too many NAs in rows") %>%
+ validate_rows(maha_dist, within_n_mads(10), everything(), description = "maha dist within 10 mads") %>%
+ add_results(report)
Error in `dplyr::select()` at assertr/R/assertions.R:102:2:
! Can't subset columns that don't exist.
✖ Column `WRONG_COLUMN_NAME` doesn't exist.
```
As far as I can tell, if the user provides a table in which a validated column doesn't exist, the validate workflow throws an error instead of producing a report stating that validation failed due to missing required columns.
Problem
There are no checks that the validated columns exist in the provided data.frame.
So, the column-exists check must be placed outside of the generate-validation-report workflow.
The feedback to the user is then split into at least 2 validations: 1) a check for the required columns and 2) the validation report -- instead of just one all-encompassing validation report.
Proposed Solution
Include assertr::has_all_names in the validation report, or if that is already possible, provide an example in the package README.
Alternatives Considered
I'm currently validating the existence of the required columns prior to using data.validator, and providing user feedback on column existence via shiny::showNotification().
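For reference, the pre-check described above might look roughly like the sketch below. It uses only base R; `find_missing_columns` and `required_cols` are hypothetical names made up for this illustration and are not part of data.validator or assertr.

```r
# Hypothetical helper (not part of data.validator): return the names of
# required columns that are absent from a data.frame.
find_missing_columns <- function(data, required) {
  setdiff(required, names(data))
}

# Columns the validation pipeline above refers to by name
required_cols <- c("WRONG_COLUMN_NAME", "vs", "am", "mpg", "drat")
missing_cols <- find_missing_columns(mtcars, required_cols)

if (length(missing_cols) > 0) {
  # In a Shiny app, this text could be passed to shiny::showNotification()
  message("Missing required columns: ", paste(missing_cols, collapse = ", "))
} else {
  message("All required columns are present; safe to run validate().")
}
```

This works, but the result of the column check lives outside the validation report, which is the split in user feedback this issue is about.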