Releases: strengejacke/sjmisc

sjmisc 2.7.5

13 Sep 18:21

General

  • Reduce package dependencies.

New functions

  • de_mean() to compute group-meaned and de-meaned variables.
  • add_variables() and add_case() to add columns or rows in a convenient way to a data frame.
  • move_columns() to move one or more columns to another position in a data frame.
  • is_num_chr() to check whether a character vector has only numeric strings.
  • seq_col() and seq_row() as convenient wrappers to create a regular sequence for column or row numbers.
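
As a rough illustration of what "group-meaned" and "de-meaned" variables are, the following base R sketch uses ave(). It is not the implementation of de_mean(); the data frame and its column names are made up for the example.

```r
# Hypothetical example data: a score measured within two groups.
d <- data.frame(
  grp   = c("a", "a", "b", "b"),
  score = c(1, 3, 10, 20)
)

# Group-meaned variable: every row receives the mean of its group.
d$score_gm <- ave(d$score, d$grp)   # 2, 2, 15, 15

# De-meaned variable: deviation of each value from its group mean.
d$score_dm <- d$score - d$score_gm  # -1, 1, -5, 5

# In base R, regular sequences over column and row numbers
# can be built with seq_len().
seq_len(ncol(d))
seq_len(nrow(d))
```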

Changes to functions

  • descr() gets a weights-argument, to print weighted descriptive statistics.
  • The n-argument in row_means() and row_sums() may now also be Inf, to compute means or sums only if all values in a row are valid (i.e. non-missing).
  • Argument weight.by in frq() was renamed to weights.
  • frq() gets a title-argument, to specify an alternative title to the variable label.
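
The rule behind the n-argument can be sketched in base R as follows. The helper row_means_sketch() is hypothetical and only mirrors the behaviour described above for whole-number n and for Inf; it is not the package's code.

```r
# Hypothetical helper, not part of sjmisc: returns a row mean only if
# the row has at least `n` non-missing values, otherwise NA.
row_means_sketch <- function(df, n) {
  m <- as.matrix(df)
  valid <- rowSums(!is.na(m))  # number of non-missing values per row
  # n = Inf is read as "every value in the row must be valid".
  required <- if (is.infinite(n)) ncol(m) else n
  out <- rowMeans(m, na.rm = TRUE)
  out[valid < required] <- NA
  out
}

df <- data.frame(a = c(1, 2, NA), b = c(3, NA, NA), c = c(5, 6, 7))

res_two <- row_means_sketch(df, n = 2)    # 3, 4, NA
res_all <- row_means_sketch(df, n = Inf)  # 3, NA, NA
```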

Bug fixes

  • round_num() preserves data frame attributes.
  • frq() printed frequencies of the grouping variable for grouped data frames when weights was not NULL.
  • Fixed issue with wrong title in frq() for grouped data frames, when grouping variable was an unlabelled factor.

sjmisc 2.7.4

05 Aug 06:31

New functions

  • has_na() to check if variables or observations in a data frame contain NA, NaN or Inf values. Convenient shortcuts for this function are complete_cases(), incomplete_cases(), complete_vars() and incomplete_vars().
  • total_mean() to compute the overall mean of all values from all columns in a data frame.
  • prcn() to convert numeric scalars between 0 and 1 into a character-percentage value.
  • numeric_to_factor() to convert numeric variables into factors, using associated value labels as factor levels.

Changes to functions

  • set_na() can now also replace different values per variable with NA.
  • Changed how row_sums() handles missing values. row_sums() gets an n-argument and now computes row sums if a row has at least n non-missing values.

sjmisc 2.7.3

20 Jun 18:05

General

  • A test-suite was added to the package.
  • Updated reference in CITATION to the publication in the Journal of Open Source Software.

New functions

  • is_cross_classified() to check whether two factors are partially crossed.

Changes to functions

  • ref_lvl() now also accepts value labels as value for the lvl-argument. Additionally, ref_lvl() now also works for factors with non-numeric factor levels and simply returns relevel(x, ref = lvl) in such cases.
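
For factors with non-numeric levels, the result is therefore what base R's relevel() gives. A minimal example, with made-up factor values:

```r
# Example factor with non-numeric levels (values invented for illustration).
x <- factor(c("low", "mid", "high", "mid"), levels = c("low", "mid", "high"))

# relevel() moves the chosen level to the first position,
# which makes it the reference level.
x2 <- relevel(x, ref = "mid")
levels(x2)  # "mid" "low" "high"
```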

Bug fixes

  • Fixed encoding issues in rec() with direct labelling for certain locales.
  • Fixed issue in count_na(), which did not print labels of tagged NA values since the last revision of frq().
  • Fixed issue in merge_imputation() for cases where the original data frame had fewer columns than the imputed data frames.
  • Fixed issue in find_var() for fuzzy-matching in all elements (i.e. when fuzzy = TRUE and search = "all").

sjmisc 2.7.2-1 (JOSS)

20 Jun 08:08

Revised version 2.7.2, which is published in JOSS.

sjmisc 2.7.2

18 May 07:49

New functions

  • round_num() to round only numeric values in a data frame.

General

  • Improved performance for merge_df(). Furthermore, add_rows() was added as alias for merge_df().
  • merge_df() and add_rows() now create a unique id-name instead of dropping the ID-variable when id has the same name as an existing variable in the provided data frames.
  • Improved performance for descr() and minor changes to the output.

Support for mids-objects (package mice)

The following functions now also work on mids-objects, as returned by the mice()-function:

  • row_count(), row_sums(), row_means(), rec(), dicho(), center(), std(), recode_to() and to_long().

Changes to functions

  • The weight.by-argument in frq() now should be a variable name from a variable in x, and no longer a separate vector.

Bug fixes

  • descr() does not work with character vectors, so these are now removed.

sjmisc 2.4.0

07 Apr 08:54

General

  • Argument value in set_na() is deprecated. Please use na instead.
  • Argument recodes in rec() is deprecated. Please use rec instead.
  • Argument lab in set_label() is deprecated. Please use label instead.
  • Argument value in add_labels() and replace_labels() is deprecated. Please use labels instead.
  • Argument value in ref_lvl() is deprecated. Please use lvl instead.

New functions

  • row_sums() as wrapper of rowSums() to compute row sums, but works within a pipe-workflow and with select-helpers for variables, and always returns a tibble.
  • row_means() as wrapper of sjstats::mean_n() to compute row means, but works within a pipe-workflow and with select-helpers for variables, and always returns a tibble.
  • %nin% as complement to %in%.
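
"Complement to %in%" means the operator is TRUE exactly where %in% is FALSE. A base R stand-in, deliberately given a different (hypothetical) name so it does not mask the package's operator, could look like this:

```r
# Hypothetical stand-in for illustration; sjmisc exports the operator as %nin%.
`%not_in%` <- function(x, table) !(x %in% table)

res <- c("a", "b", "z") %not_in% c("a", "b", "c")
res  # FALSE FALSE TRUE
```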

Changes to functions

  • Functions rec(), dicho(), center(), std(), recode_to() and group_var() get an append-argument, to optionally return the original data including the transformed variables as new columns.
  • var_labels() and var_rename() now give a warning if specified variables to rename or relabel do not exist in the data frame. Non-matching variables are ignored.
  • If model.term does not exist in models, spread_coef() now prints the name of non-existing coefficients.
  • find_var() gets a fuzzy-argument to enable fuzzy-matching for the search pattern.

Bug fixes

  • remove_empty_cols() returned an empty data frame, when input data frame had no empty columns.
  • remove_empty_rows() returned an empty data frame, when input data frame had no empty rows.
  • add_columns() and replace_columns() in some cases coerced data frames of class data.frame with only one column into a vector, which gave an error when binding columns.
  • Argument part.dist.match in str_pos() caused an error when being larger than 0.

sjmisc 2.3.1

07 Mar 07:11

General

  • Re-exports magrittr::%>% (Bob Rudis said more packages should do so).

New functions

  • replace_columns() to replace variables in one data frame with variables from other data frames.

Changes to functions

  • descr() gets a max.length-argument to shorten variable labels in the output to a specific number of chars.
  • descr() now also reports the percentage of missing values.
  • set_na() no longer gives a warning when trying to replace values with NA in vectors that contain only NAs.

Bug fixes

  • descr() now also works on single vectors as data argument.
  • Fixed bugs with write_*()-functions.

sjmisc 2.3.0

08 Feb 06:51

General

  • Added package-vignettes.
  • Functions were largely revised to work seamlessly within the tidyverse. This may break existing code, but in the long run, consistent API design makes working with the package more intuitive. See vignette("design_philosophy", "sjmisc") for more details.
  • as_labelled() only converts vectors into labelled-class if vector has label attributes. This ensures that data can be properly saved into other formats, e.g. with write_spss().
  • The write_*()-functions have been revised and should now save data frames with any common classes of vectors (labelled, factor, numeric, atomic...).

New functions

  • center() and std() are moving from package sjstats to sjmisc.
  • add_columns() to bind columns of first data frame at the end of all data frames.

Changes to functions

  • frq() now has the same argument-structure as flat_table().
  • The following functions now follow a consistent tidyverse-approach, with the data being the first argument, followed by variable names: add_labels(), replace_labels(), remove_labels(), count_na(), rec(), dicho(), split_var(), drop_labels(), fill_labels(), group_var(), group_labels(), ref_lvl(), recode_to(), replace_na(), set_na() and set_labels().
  • get_values() now sorts returned values by default, to be consistent with get_labels().
  • spread_coef() gets arguments se and p.val, to define whether standard errors and p-values should be included in the return value or not.

Bug fixes

  • merge_df() did not copy label attributes for data frames with identical variables (that were row-bound).
  • to_value() did not work for character vectors of class labelled, with empty values (which typically have no value label).

sjmisc 2.2.1

09 Jan 20:07

Bug fixes

  • The sort.frq-argument did not work in frq().

sjmisc 2.2.0

21 Dec 11:53

New functions

  • zap_inf() to "clean" vectors from NaN and infinite values.
  • descr() to provide basic descriptive statistics (similar to describe() in the psych-package), but including variable labels and usable in pipe-workflows. Also works with grouped data frames.

Changes to functions

  • rec(), split_var() and dicho() get an argument suffix, to append a suffix to variable (column) names, if applied on a data frame.
  • Value labels in rec() can now directly be assigned inside the recodes-syntax (see 'Details' in ?rec).
  • find_var() gets an as.df-argument, to return a data frame with matching variables, instead of their column indices only.
  • find_var() gets an as.varlab-argument, to return a "summary" data frame with column number, variable name and variable label.
  • flat_table() now also accepts grouped data frames.
  • flat_table() gets a show.values-argument, to add values to associated labels in output.
  • frq() now also accepts grouped data frames.
  • frq() gets a weight.by-argument to weight frequencies.
  • set_na() can now also find values by their value labels and replace them with NA.
  • set_na() now removes unused value labels from values that have been replaced with NA.
  • The as.tag-argument in set_na() now defaults to FALSE.
  • get_labels() now always returns labels in sorted order of the associated values.
  • get_labels() gets a drop.unused-argument, to automatically drop labels from values that don't occur in the vector.
  • For a named vector as labels-argument, set_labels() now always sorts labels in sorted order of the associated values.
  • is_empty() gets a first.only-argument, to evaluate either first or all elements of a character vector.

Bug fixes

  • set_na() did not work on vectors of class Date when argument as.tag = TRUE.
  • flat_table() did not show values that had no value labels. Now all categories are shown in the frequency table.
  • rec() did not properly copy labels of tagged NA values when not all recoded values appeared in the vector.
  • frq() did not show correct values when value labels of a vector were not sorted according to their values.
  • set_labels() did not set labels properly for ordered factors.
  • remove_labels() returned NA-values for value labels (instead of no value labels) when the last value label of a vector was removed.