descr computes statistics (e.g. min, max) incorrectly when column names end with the name of a statistic (e.g. "column_min" or "column_max") #152

Open
yenchiayi opened this issue Nov 4, 2021 · 0 comments

I have a small data.frame with 2 rows and 3 columns:

column0 column1 column2
1 11 21
2 12 22

The descr function calculates everything correctly if I set the column names to c("x", "x_1", "x_2"):

library(magrittr)  # provides the %>% pipe used below

df <- data.frame(
  x = 1:2,
  x_1 = 11:12,
  x_2 = 21:22
)
df %>%
  summarytools::descr(stats = c("min", "max", "n.valid", "skewness", "kurtosis"))
              x     x_1     x_2
Min        1.00   11.00   21.00
Max        2.00   12.00   22.00
N.Valid    2.00    2.00    2.00
Skewness   0.00    0.00    0.00
Kurtosis  -2.75   -2.75   -2.75

However, if I set the column names to c("x", "x_min", "x_max"), descr computes the minimum and maximum incorrectly, along with the other statistics ("n.valid", "skewness", and "kurtosis").

df <- data.frame(
  x = 1:2,
  x_min = 11:12,
  x_max = 21:22
)
df %>%
  summarytools::descr(stats = c("min", "max", "n.valid", "skewness", "kurtosis"))

As the output below shows, the Min of x_max (21) is larger than its Max (2). N.Valid, Skewness, and Kurtosis are also wrong for both x_max and x_min.

              x   x_max   x_min
Min        1.00      21       1
Max        2.00       2       1
N.Valid    2.00       1       1
Skewness   0.00      NA      NA
Kurtosis  -2.75      NA      NA
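For comparison, the values I would expect for min, max, and n.valid can be cross-checked with base R alone. This is only a sanity check on the same data, not a replacement for descr; the object name `expected` is mine.

```r
# Same data as in the failing example above.
df <- data.frame(
  x = 1:2,
  x_min = 11:12,
  x_max = 21:22
)

# Compute each statistic column by column with base R,
# so the column names play no role in the calculation.
expected <- sapply(df, function(col) {
  c(
    Min     = min(col, na.rm = TRUE),
    Max     = max(col, na.rm = TRUE),
    N.Valid = sum(!is.na(col))
  )
})

print(expected)
```

Each column should report a Min no larger than its Max and an N.Valid of 2, which is what descr returns when the columns are named x, x_1, and x_2.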

My preliminary guess is that the program fails to distinguish a column-name suffix (e.g. the "_min" in x_min) from the name of a statistic (e.g. min). The issue appears to arise around lines 367-373 of descr.R. You may want to check this and see what happens.
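To illustrate the kind of collision I have in mind, here is a minimal base R sketch. It is purely hypothetical and is not the code from descr.R: it assumes that results are labelled "<variable>_<statistic>" and later split back apart with a regular expression. All names in it (`labels`, `ambiguous_vars`, `anchored_vars`) are my own.

```r
# Hypothetical illustration only; NOT the actual summarytools code.
stats <- c("min", "max")
vars  <- c("x", "x_min", "x_max")

# Combined "<variable>_<statistic>" labels, e.g. "x_min_max".
labels <- as.vector(outer(vars, stats, paste, sep = "_"))

# Alternation matching any statistic name preceded by an underscore.
stat_pattern <- paste0("_(", paste(stats, collapse = "|"), ")")

# Ambiguous split: cutting at the FIRST "_<stat>" match truncates
# column names that themselves contain a statistic name.
ambiguous_vars <- sub(paste0(stat_pattern, ".*"), "", labels)

# Safer split: anchor the match to the END of the label so that
# only the final suffix is treated as the statistic.
anchored_vars  <- sub(paste0(stat_pattern, "$"), "", labels)
anchored_stats <- sub(paste0("^.*", stat_pattern, "$"), "\\1", labels)

print(data.frame(labels, ambiguous_vars, anchored_vars, anchored_stats))
```

With the unanchored pattern every label collapses to the variable "x", so values from x_min and x_max would be mixed up with those of x. Anchoring the pattern to the end of the string recovers the three original column names.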

Thanks!
