to_wide should not use dcast(fun.aggregate=identity) #18

Open
tdhock opened this issue Mar 29, 2024 · 2 comments
tdhock commented Mar 29, 2024

Hi @gdemin
Recently data.table merged some new dcast code which checks fun.aggregate more strictly. fun.aggregate is supposed to be a function which returns a single value. This is documented on ?dcast: "The aggregating function should take a vector as input and return a single value (or a list of length one) as output."

Using new data.table from github master, we ran example("to_wide") which gave the following:

> iris %>%
+     to_long(list(Sepal = cols("^Sepal"), Petal = cols("^Petal"))) %>%
+     let(
+         variable = factor(variable, levels = 1:2, labels = c("Length", "Width"))
+     ) %>%
+     to_wide(values_in = c(Sepal, Petal))
Error: Aggregating function(s) should take vector inputs and return a single value (length=1). However, function(s) returns length!=1. This value will have to be used to fill any missing combinations, and therefore must be length=1. Either override by setting the 'fill' argument explicitly or modify your function to handle this case appropriately.

Details: Rdatatable/data.table#6032

It seems that to_wide is calling dcast with fun.aggregate=identity, which is problematic because identity returns its input unchanged, so the result can have length != 1. Can you please modify your code so that it uses a fun.aggregate that returns a single value (length=1)?
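To illustrate the length-1 contract described above, here is a minimal base R sketch. The helper `single_value` is hypothetical (it is not part of data.table or of this package); it only shows one way a function could return exactly one value for empty input while still failing loudly when a cell holds several values.

```r
# Hypothetical helper, for illustration only.
single_value <- function(x) {
  # More than one value per cell means the data are not in clean long form.
  if (length(x) > 1L) stop("expected at most one value per cell, got ", length(x))
  # Zero-length input: return a typed NA of length 1.
  if (length(x) == 0L) return(x[NA_integer_])
  x
}

empty <- numeric(0)
length(identity(empty))      # 0, so identity breaks the length-1 contract
length(single_value(empty))  # 1
length(identity(c(1, 2)))    # 2, again not a single value
```

Whether a helper like this is the right replacement depends on how to_wide is meant to behave when a combination has several values.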

Thanks in advance!

Owner

gdemin commented Apr 6, 2024

Hi @tdhock,

Thank you, I will fix it in the near future.
I think I will change the test and leave identity as the default. This is because I mostly use to_wide for conversion from long form, where each combination should have only one value. If there are more values, then something has gone wrong and an error message is entirely appropriate.
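The "one value per combination" expectation can also be checked explicitly before reshaping. The following is only a sketch under assumptions: the data and the helper `check_unique_cells` are made up for illustration, and this is not how to_wide is implemented.

```r
library(data.table)

# Made-up long-format data: one value per (id, variable) combination.
long <- data.table(
  id = c(1, 1, 2, 2),
  variable = c("Length", "Width", "Length", "Width"),
  value = c(5.1, 3.5, 4.9, 3.0)
)

# Hypothetical helper: stop if any combination of the key columns repeats.
check_unique_cells <- function(dt, key_cols) {
  if (anyDuplicated(dt, by = key_cols) > 0L) {
    stop("more than one value for some combination of: ",
         paste(key_cols, collapse = ", "))
  }
  invisible(TRUE)
}

check_unique_cells(long, c("id", "variable"))

# With unique combinations, dcast needs no fun.aggregate at all.
wide <- dcast(long, id ~ variable, value.var = "value")
```

With this approach duplicates produce an error up front, and dcast never has to call an aggregating function.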

Regards, Gregory

Author

tdhock commented Apr 8, 2024

Great, thanks, that sounds reasonable.
