- `rec()` now keeps the labels of the old values with the new ones when no labels are specified and there is a one-to-one correspondence between old and new values.
- `format()` for `frq()` was revised and now allows formatting the frequency table for printing in text, markdown and HTML format. To do so, use the methods `print()`, `print_md()` or `print_html()`.
- Address changes in forthcoming update of sjstats.
- Fix CRAN check issues.
- A first draft of `format()` for `frq()` was implemented.
- `merge_df()` preserves more attributes related to labelled data.
- `to.factor` is an alias for the argument `as.num`.
- Updating imports.
- Fixed bug in `move_columns()` (using a variable as value for argument `.after` didn't work).
- `flat_table()` gains a `weights`-argument.
- `descr()` calculated wrong percentage of missing values for weighted data.
- Fixed issue in `rec()` when `min`, `max`, `lo` or `hi` was used to recode a numeric into a character vector, and the new recode string contained one of these four strings as pattern.
- Give informative warning in `rec()` when `max` or `hi` was used to recode a variable whose maximum value was lower than the defined range, e.g. `4:max` when the maximum value was lower than 4.
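As a rough illustration of the `min` and `max` tokens mentioned in the `rec()` items above, the following sketch recodes a numeric vector into two groups. It assumes that sjmisc is installed; the vector `x` is made-up example data, not taken from the package.

```r
library(sjmisc)

# made-up example data
x <- c(1, 2, 3, 4, 5, 6)

# "min" and "max" stand for the smallest and largest value in x,
# so this maps 1 to 3 onto 1 and 4 to 6 onto 2
x_r <- rec(x, rec = "min:3=1; 4:max=2")

# drop attributes to look at the plain recoded values
as.numeric(x_r)
```

If the largest value in `x` were below 4, the range `4:max` would be the situation the warning described above refers to.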
- `descr()` now also calculates the IQR.
- Revised `print()`-method for `frq()`.
- Minor changes to be compatible with forthcoming dplyr-release.
- Added `as.data.frame()` for `frq()`.
- `typical_value()` now returns the median for integer-values (instead of mean), to preserve the integer-type of a variable.
- The recode-pattern in `rec()` now also works for character variables with whitespaces.
- `rec()` now warns explicitly for possible non-intended multiple assignment of identical new recode-values.
- Improved printing for `frq()`.
- `merge_imputations()` now returns the plot-object as well.
- `to_numeric()` as alias for `to_value()`.
- Fixed warning from CRAN checks.
- Fixed errors from CRAN checks.
- Alias `find_variables()` (alias for `find_var()`) was renamed to `find_in_data()`, to avoid conflicts with package insight.
- `rename_variables()` and `rename_columns()` are aliases for `var_rename()`.
- `frq()` now also prints frequencies of logical conditions, e.g. how many values are lower or greater than a certain threshold.
- `frq()` gets a `min.frq`-argument, indicating the minimum frequency for which a value will be shown in the output.
- `descr()` gets a `show` argument to show selected columns only.
- `descr()` gets a `file`-argument to write the output as HTML file.
- `var_rename()` now also accepts a named vector with multiple elements as ellipses-argument.
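The `frq()` and `descr()` additions listed above could be used roughly as follows. This is a minimal sketch that assumes sjmisc is installed and uses the built-in `mtcars` data; the threshold of 10 and the selected statistics are arbitrary choices for illustration.

```r
library(sjmisc)

# frequencies of a logical condition instead of a variable
f_cond <- frq(mtcars, cyl == 6)

# only show values of carb that occur at least 10 times
f_min <- frq(mtcars, carb, min.frq = 10)

# restrict the descriptive statistics to selected columns
d <- descr(mtcars, mpg, hp, show = c("n", "mean", "sd"))

print(f_cond)
print(f_min)
print(d)
```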
- Fixed erroneous warning in `de_mean()`.
- `merge_df()` now removes columns with identical column names inside a data frame before merging, to avoid errors.
- Fixed issue when printing character vectors in `frq()`, where first element was empty, and vectors were not provided as data frame argument.
- Fixed issue in `word_wrap()` when processing expressions.
- Fixed issue in `rec()` with token `rec = "rev"`, when reversing labelled vectors with more value labels than values.
- `find_variables()` as alias for `find_var()`.
- Revised docs.
- Fixed issue with forthcoming update of the rlang package.
- Some print-methods, especially for grouped data frames, are now more compact.
- `reshape_longer()`, as alternative to `to_long()`, probably easier to remember (function and argument-names).
- `frq()` displayed labels as `NA` in some situations for grouped data frames with more than one group, when data were not labelled.
- Reduce package dependencies.
- `str_pos()` was renamed into `str_find()`.
- New package-vignette Recoding Variables.
- `typical_value()`, which was formerly located in package sjstats.
- `is_whole()` now automatically removes missing values from vectors.
- `is_empty()` now also checks lists with only `NULL`-elements.
- Better handling of factors in `merge_imputations()`, which previously could result in `NA`-values when merging imputed values into one variable.
- Fix issue in `is_empty()` in case the vector had non-missing values, but first element of vector was `NA`.
- Fixed bug in `frq()` for grouped data frame, when grouping variable was a character vector. In this case, group titles were mixed up.
- Fix encoding issues in help-files.
- `tidy_values()` to "clean" values (i.e. remove special chars) of character vectors or levels of factors.
- `add_id()` to quickly add an ID variable to (grouped) data frames.
- `frq()` gets a `show.na`-argument, to (automatically) show or hide the information for `NA`-values from the output.
- The `weights`-argument in `frq()` now also accepts vectors, and is not limited to variable names. Note that these vectors must be part of a data frame.
- For recode-functions (like `rec()`, `dicho()`, ...), if `suffix = ""` and `append = TRUE`, existing variables will be replaced by the new, recoded variables.
- Improved performance for `group_str()`.
- `var_rename()` now supports quasi-quotation (see Examples).
- `row_sums()` and `row_means()` now return the input data frame when this data frame only had one column and no row means or sums were calculated. The returned data frame still gets the new variable name defined in `var`.
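The interplay of `suffix = ""` and `append = TRUE` described above can be sketched as follows. The example assumes sjmisc is installed; the data frame `dat` and the recode pattern are made up for illustration.

```r
library(sjmisc)

# made-up example data
dat <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1))

# with a non-empty suffix, the recoded variable is added as a new column
res_new <- rec(dat, a, rec = "1,2=0; 3,4=1", append = TRUE, suffix = "_r")

# with an empty suffix, column "a" itself is replaced by the recoded values
res_replaced <- rec(dat, a, rec = "1,2=0; 3,4=1", append = TRUE, suffix = "")

colnames(res_new)
colnames(res_replaced)
```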
- `complete_cases()` returned an empty vector instead of all indexes if all cases (rows) of a data frame were complete.
- Fix issue with `to_dummy()` for character-vector input.
- Fix issue with missing values in `group_str()`.
- Fix issue with grouped data frames in `frq()` when `grp.strings = TRUE`.
- `frq()` gets a `file` and `encoding` argument, to save the HTML output as file.
- `add_variables()` and `move_columns()` now preserve the attributes of a data frame.
- Reduce package dependencies.
- `de_mean()` to compute group-meaned and de-meaned variables.
- `add_variables()` and `add_case()` to add columns or rows in a convenient way to a data frame.
- `move_columns()` to move one or more columns to another position in a data frame.
- `is_num_chr()` to check whether a character vector has only numeric strings.
- `seq_col()` and `seq_row()` as convenient wrapper to create a regular sequence for column or row numbers.
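A short sketch of how some of the helper functions listed above might be called, assuming sjmisc is installed and using the built-in `iris` data:

```r
library(sjmisc)

# move the last column of iris to the first position
moved <- move_columns(iris, Species, .before = 1)
colnames(moved)

# TRUE only if every string can be read as a number
all_numeric <- is_num_chr(c("1", "2", "3"))
not_numeric <- is_num_chr(c("1", "b", "3"))

# regular sequences over the columns and rows of a data frame
cols <- seq_col(iris)
rows <- seq_row(iris)
```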
- `descr()` gets a `weights`-argument, to print weighted descriptive statistics.
- The `n`-argument in `row_means()` and `row_sums()` now also may be `Inf`, to compute means or sums only if all values in a row are valid (i.e. non-missing).
- Argument `weight.by` in `frq()` was renamed into `weights`.
- `frq()` gets a `title`-argument, to specify an alternative title to the variable label.
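The effect of the `n`-argument in `row_sums()` and `row_means()` can be sketched with a small data frame that contains missing values. The example assumes sjmisc is installed; `dat` is made-up data, and the result columns are named explicitly via `var`.

```r
library(sjmisc)

# made-up data: rows have 1, 3, 0 and 2 valid values
dat <- data.frame(
  c1 = c(1, 2, NA, 4),
  c2 = c(NA, 2, NA, 5),
  c3 = c(NA, 4, NA, NA)
)

# row sums only for rows with at least 2 non-missing values
s2 <- row_sums(dat, n = 2, var = "rs", append = TRUE)

# row sums only for rows where all values are non-missing
s_all <- row_sums(dat, n = Inf, var = "rs", append = TRUE)

# row means follow the same rule
m2 <- row_means(dat, n = 2, var = "rm", append = TRUE)

s2$rs
s_all$rs
m2$rm
```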
- `round_num()` preserves data frame attributes.
- `frq()` printed frequencies of grouping-variable for grouped data frames, when `weights` was not `NULL`.
- Fixed issue with wrong title in `frq()` for grouped data frames, when grouping variable was an unlabelled factor.
- `has_na()` to check if variables or observations in a data frame contain `NA`, `NaN` or `Inf` values. Convenient shortcuts for this function are `complete_cases()`, `incomplete_cases()`, `complete_vars()` and `incomplete_vars()`.
- `total_mean()` to compute the overall mean of all values from all columns in a data frame.
- `prcn()` to convert numeric scalars between 0 and 1 into a character-percentage value.
- `numeric_to_factor()` to convert numeric variables into factors, using associated value labels as factor levels.
- `set_na()` now also replaces different values per variable with `NA`.
- Changed behaviour of `row_sums()` and missing values. `row_sums()` gets an `n`-argument and now computes row sums if a row has at least `n` non-missing values.
- A test-suite was added to the package.
- Updated reference in `CITATION` to the publication in the Journal of Open Source Software.
- `is_cross_classified()` to check whether two factors are partially crossed.
- `ref_lvl()` now also accepts value labels as value for the `lvl`-argument. Additionally, `ref_lvl()` now also works for factor with non-numeric factor levels and simply returns `relevel(x, ref = lvl)` in such cases.
- Fixed encoding issues in `rec()` with direct labelling for certain locales.
- Fixed issue in `count_na()`, which did not print labels of tagged `NA` values since the last revision of `frq()`.
- Fixed issue in `merge_imputations()` for cases where original data frame had fewer columns than imputed data frames.
- Fixed issue in `find_var()` for fuzzy-matching in all elements (i.e. when `fuzzy = TRUE` and `search = "all"`).
- `round_num()` to round only numeric values in a data frame.
- Improved performance for `merge_df()`. Furthermore, `add_rows()` was added as alias for `merge_df()`.
- `merge_df()` resp. `add_rows()` now create a unique `id`-name instead of dropping the ID-variable, in case `id` has the same name as any existing variable in the provided data frames.
- Improved performance for `descr()` and minor changes to the output.
- Following functions now also work on `mids`-objects, as returned by the `mice()`-function: `row_count()`, `row_sums()`, `row_means()`, `rec()`, `dicho()`, `center()`, `std()`, `recode_to()` and `to_long()`.
- The `weight.by`-argument in `frq()` now should be a variable name from a variable in `x`, and no longer a separate vector.
- `descr()` does not work with character vectors, so these are being removed now.
- Fix typos and revise outdated paragraphs in vignettes.
- The recoding and transformation functions get scoped variants, allowing to select variables based on logical conditions described in a function:
  - `rec_if()` as scoped variant of `rec()`.
  - `dicho_if()` as scoped variant of `dicho()`.
  - `center_if()` as scoped variant of `center()`.
  - `std_if()` as scoped variant of `std()`.
  - `split_var_if()` as scoped variant of `split_var()`.
  - `group_var_if()` and `group_label_if()` as scoped variant of `group_var()` and `group_label()`.
  - `recode_to_if()` as scoped variant of `recode_to()`.
  - `set_na_if()` as scoped variant of `set_na()`.
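A minimal sketch of a scoped variant, assuming sjmisc is installed: `center_if()` takes a predicate function (here base R's `is.numeric`) and only transforms the columns for which the predicate returns `TRUE`. The suffix `"_c"` is passed explicitly so the new column names are predictable.

```r
library(sjmisc)

# center only the numeric columns of iris; the factor Species is skipped
centered <- center_if(iris, is.numeric, append = TRUE, suffix = "_c")

colnames(centered)

# a centered variable has a mean of (numerically) zero
mean(centered$Sepal.Length_c)
```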
- New function `remove_cols()` as alias for `remove_var()`.
- `std()` gets a new robust-option, `robust = "2sd"`, which divides the centered variables by two standard deviations.
- Slightly improved performance for `set_na()`.
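The `robust = "2sd"` option described above can be compared against a manual calculation. This sketch assumes sjmisc is installed; `x` is made-up example data.

```r
library(sjmisc)

# made-up example data
x <- c(2, 4, 4, 5, 7, 9)

# standardize by dividing the centered values by two standard deviations
z_2sd <- std(x, robust = "2sd")

# the same computation written out in base R
manual <- (x - mean(x)) / (2 * sd(x))

all.equal(as.numeric(z_2sd), manual)
```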
- `frq()` now removes empty columns before computing frequencies, because applying `frq()` on empty vectors caused an error.
- `empty_cols()` and `empty_rows()` (and hence, `remove_empty_cols()` and `remove_empty_rows()`) caused an error for data frames with only one column resp. row, or if `x` was a vector and no data frame.
- `frq()` now removes missing values from input when weights are applied, to ensure that input and weights have same length.
- Breaking changes: The `append`-argument in recode and transformation functions like `rec()`, `dicho()`, `split_var()`, `group_var()`, `center()`, `std()`, `recode_to()`, `row_sums()`, `row_count()`, `col_count()` and `row_means()` now defaults to `TRUE`.
- The `print()`-method for `descr()` now accepts a `digits`-argument, to specify the rounding of the output.
- Cross references from `dplyr::select_helpers` were updated to `tidyselect::select_helpers`.
- `is_whole()` as counterpart to `is_float()`.
- `frq()` now prints variable names for non-labelled data, adds variable names in braces for labelled data and omits the label column for non-labelled data.
- `frq()` now prints mean and standard deviation in the header line of the output.
- `frq()` now gets an `auto.grp`-argument to automatically group variables with many unique values.
- `frq()` now gets a `show.strings`-argument to omit string variables (character vectors) from being printed as frequency table.
- `frq()` now gets a `grp.strings`-argument to group similar string values in the frequency table.
- `frq()` gets an `out`-argument, to print output to console, or as HTML table in the viewer or web browser.
- `descr()` gets an `out`-argument, to print output to console, or as HTML table in the viewer or web browser.
- `is_empty()` returned `TRUE` for single vectors with `NA` being the first element.
- Fix issue where, due to a bug during code cleanup, `remove_empty_rows()` no longer removed empty rows, but columns.
- Revised examples that used removed methods from other packages.
- Use select-helpers from package tidyselect, instead of dplyr.
- Beautiful colored output for `frq()`, `descr()` and `flat_table()`.
- `rec()` now also recodes doubles with floating points, if a range of values is specified.
- `std()` and `center()` now use `include.fac = FALSE` as default option.
- `std()` gets a `robust`-argument, to divide variables either by standard deviation, or - in case of asymmetrically distributed variables - median absolute deviation or Gini's mean difference.
- `frq()` now shows total and valid N in output.
- `center()`, `std()`, `dicho()`, `split_var()` and `group_var()` did not work correctly for grouped data frames.
- `frq()` did not print multiple variables when applied on grouped data frames.
- Arguments `as.df` and `as.varlab` in function `find_var()` are now deprecated. Please use `out` instead.
- `rotate_df()` preserves attributes.
- `is_float()` is now exported as function.
- Fixed bug for `to_label()`, when `x` was a character vector and argument `drop.levels` was `TRUE`.
- Fixed issue with latest tidyr-update on CRAN.
- `frq()` did not correctly calculate valid and cumulative percentages when using weights.
- All labelled-data functions were removed and are now in package sjlabelled.
- `remove_var()` as pipe-friendly function to remove variables from data frames.
- `var_type()` as pipe-friendly function to determine the type of variables.
- `all_na()` to check whether a vector only consists of NA values.
- `rotate_df()` to rotate data frames (switch columns and rows).
- `shorten_string()`, to shorten strings to a certain maximum number of chars.
- Following functions now also work on grouped data frames: `dicho()`, `split_var()`, `group_var()`, `std()` and `center()`.
- Argument `groupcount` in `split_var()`, `group_var()` and `group_labels()` is now named `n`.
- Argument `groupsize` in `group_var()` and `group_labels()` is now named `size`.
- `frq()` gets a revised print-method, which does not print the result to console when captured in an object (i.e., `x <- frq(x)` no longer prints the result).
- `frq()` no longer prints (redundant) labels for factors w/o value label attributes.
- `frq()` adds information about the variable type in the table caption (only for variables with variable labels).
- `frq()` adds information about groups when printing grouped, non-labelled variables.
- `descr()` now also prints information about the variable type.
- `to_character()` now preserves variable labels.
- sjmisc now uses dplyr's tidyeval-approach to evaluate arguments. This means that the select-helper-functions (like `one_of()` or `contains()`) no longer need to be prefixed with a `~` when used as argument within sjmisc-functions.
- All labelled-data functions are now deprecated and will become defunct in future package versions. The labelled-data functions have been moved into a separate package, sjlabelled.
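With the tidyeval-approach described above, select-helpers can be passed directly to sjmisc functions. The following sketch assumes that both sjmisc and dplyr are installed; dplyr is attached here only to make helpers such as `contains()` available, and `iris` serves as example data.

```r
library(sjmisc)
library(dplyr)  # makes contains(), starts_with() etc. available

# no "~" prefix needed: describe all columns whose name contains "Sepal"
d <- descr(iris, contains("Sepal"))
print(d)

# the same works for frequency tables
f <- frq(iris, starts_with("Spec"))
print(f)
```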
- `row_count()` to count specific values in a data frame per observation.
- `col_count()` to count specific values in a data frame per variable.
- `str_start()` and `str_end()` to find starting and end indices of patterns inside strings.
- The output for `frq()` now always includes an `NA`-row, but no longer prints a value for the `NA`-row.
- `merge_imputations()` gets a `summary`-argument to plot a graphical summary of the quality of the merging process.
- `add_columns()` and `replace_columns()` crashed R when no data frame was specified in the `...`-ellipses argument.
- `descr()` and `frq()` used wrong variable labels when processing grouped data frames for specific situations, where the grouping variable had no sequential values.
- `descr()` did not work for large data frames, because internally, `psych::describe()` switched to fast mode by default in that case (removing columns from the output).