dplyr 0.6.0 join problem with CRAN version of sparklyr 0.5.5 #2825
Comments
Thanks for reporting this @JohnMount, really appreciated. The problem here is that in order to support joins in [...]
I think the best path here is to push a patch for [...]
@JohnMount if you could try out this [...]
@hadley could you ping me on Slack when you submit [...]
Thanks @javierluraschi. It looks like [...]
suppressPackageStartupMessages(library('dplyr'))
library('sparklyr')
sc <- spark_connect(version='2.0.2',
                    master = "local")
d1 <- copy_to(sc, data.frame(x=1:3, y=4:6), 'd1')
d2 <- copy_to(sc, data.frame(x=1:3, y=7:9), 'd2')
left_join(d1, d2, by='x')
#> Error: Column `y` must have a unique name
# print versions
packageVersion("dplyr")
#> [1] '0.7.0'
packageVersion("sparklyr")
#> [1] '0.5.5'
if(requireNamespace("dbplyr", quietly = TRUE)) {
  packageVersion("dbplyr")
}
#> [1] '1.0.0'
R.Version()$version.string
#> [1] "R version 3.4.0 (2017-04-21)"
# cleanup
spark_disconnect(sc)
suppressPackageStartupMessages(library('dplyr'))
library('sparklyr')
sc <- spark_connect(version='2.0.2',
                    master = "local")
d1 <- copy_to(sc, data.frame(x=1:3, y=4:6), 'd1')
d2 <- copy_to(sc, data.frame(x=1:3, y=7:9), 'd2')
left_join(d1, d2, by='x')
#> # Source:   lazy query [?? x 3]
#> # Database: spark_connection
#>       x   y.x   y.y
#>   <int> <int> <int>
#> 1     1     4     7
#> 2     2     5     8
#> 3     3     6     9
# print versions
packageVersion("dplyr")
#> [1] '0.7.0'
packageVersion("sparklyr")
#> [1] '0.5.5.9002'
if(requireNamespace("dbplyr", quietly = TRUE)) {
  packageVersion("dbplyr")
}
#> [1] '0.0.0.9001'
R.Version()$version.string
#> [1] "R version 3.4.0 (2017-04-21)"
# cleanup
spark_disconnect(sc)
We can probably ask people to "go to the dev version of sparklyr", but for confidence it would be good to have some assurance that a given tag or branch is stable, and to know exactly which versions of everything are needed. Hopefully CRAN will let you push a [...]
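Until a fixed release is confirmed, one way to get that kind of assurance in user code is an explicit version guard. The following is only a sketch: the function name join_combination_affected is hypothetical, and the affected ranges are assumptions read off the reprexes in this thread (dplyr 0.6.0 dev and 0.7.0 failing against sparklyr 0.5.5, sparklyr 0.5.5.9000 and later dev builds working), not an official compatibility table.

```r
# Sketch only: thresholds are assumptions taken from the versions reported
# in this thread, not a statement from the dplyr or sparklyr maintainers.
join_combination_affected <- function(dplyr_version, sparklyr_version) {
  # as.package_version() accepts both strings and packageVersion() results.
  dplyr_version <- as.package_version(dplyr_version)
  sparklyr_version <- as.package_version(sparklyr_version)
  # Assumed affected pair: dplyr at or above 0.6.0 with sparklyr at or below 0.5.5.
  dplyr_version >= package_version("0.6.0") &&
    sparklyr_version <= package_version("0.5.5")
}

# Versions from the failing and succeeding reprexes, respectively.
join_combination_affected("0.7.0", "0.5.5")
join_combination_affected("0.7.0", "0.5.5.9002")
```

At the top of a production script, the same check could be called with packageVersion("dplyr") and packageVersion("sparklyr") and wrapped in stop(), so that a bulk update fails early with a clear message rather than at the first join.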
@JohnMount on CRAN now.
The current (5-28-2017) dev version of dplyr 0.6.0 appears not to allow joins with common column names when used with the current CRAN version of sparklyr 0.5.5. This means that if this version of dplyr becomes current on CRAN before sparklyr also updates on CRAN, production user code will break on a bulk update (such as update.packages()). As a sparklyr user I would suggest this be treated as an important dependent package (sparklyr) breaking on the proposed dplyr CRAN update (regardless of the automatic check status of sparklyr 0.5.5).
The problem appears to go away if we move up to the dev version of sparklyr 0.5.5.9000.
I am re-filing the issue as I have improved the reprexes, and tested and documented more combinations of package versions. I am re-filing it here as this issue seems relevant to dplyr itself (especially as sparklyr appears to already have a fix that just needs to percolate up to CRAN).
Failing and succeeding reprexes below.