Need to be stricter about column compatibility in joins #168

Open
krlmlr opened this issue May 16, 2024 · 3 comments

krlmlr commented May 16, 2024

Also, use simple identity instead of r_base::== for joins.

dplyr::union_all(data.frame(a = 1), data.frame(a = "2"))
#> Error in `dplyr::union_all()`:
#> ! `x` and `y` are not compatible.
#> ✖ Incompatible types for column `a`: double vs character.
dplyr::setdiff(data.frame(a = 1), data.frame(a = "2"))
#> Error in `dplyr::setdiff()`:
#> ! `x` and `y` are not compatible.
#> ✖ Incompatible types for column `a`: double vs character.
dplyr::left_join(data.frame(a = 1), data.frame(a = "2"), by = "a")
#> Error in `dplyr::left_join()`:
#> ! Can't join `x$a` with `y$a` due to incompatible types.
#> ℹ `x$a` is a <double>.
#> ℹ `y$a` is a <character>.
dplyr::semi_join(data.frame(a = 1), data.frame(a = "2"), by = "a")
#> Error in `dplyr::semi_join()`:
#> ! Can't join `x$a` with `y$a` due to incompatible types.
#> ℹ `x$a` is a <double>.
#> ℹ `y$a` is a <character>.

duckplyr:::duckplyr_union_all(data.frame(a = 1), data.frame(a = "2"))
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Union
#>   r_dataframe_scan(0x11d52b258)  r_dataframe_scan(0x11d0c4e38)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (VARCHAR)
#> 
#>     a
#> 1 1.0
#> 2   2
duckplyr:::duckplyr_setdiff(data.frame(a = 1), data.frame(a = "2"))
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Except
#>   r_dataframe_scan(0x11d556c70)  r_dataframe_scan(0x11d55aea0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (VARCHAR)
#> 
#>     a
#> 1 1.0
duckplyr:::duckplyr_left_join(data.frame(a = 1), data.frame(a = "2"), by = "a")
#> Error in `left_join()` at duckplyr/R/left_join.R:19:3:
#> ! Can't join `x$a` with `y$a` due to incompatible types.
#> ℹ `x$a` is a <double>.
#> ℹ `y$a` is a <character>.
duckplyr:::duckplyr_semi_join(data.frame(a = 1), data.frame(a = "2"), by = "a")
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Join REGULAR SEMI ___eq_na_matches_na(lhs.a, rhs.a)
#>   r_dataframe_scan(0x14aace6d0)
#>   r_dataframe_scan(0x13b892678)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> 
#> [1] a
#> <0 rows> (or 0-length row.names)

Created on 2024-05-16 with reprex v2.1.0
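
For illustration, the strictness that `dplyr::semi_join()` shows above can be approximated with a check on the key columns before the relation is built. This is only a sketch: `check_join_keys()` is a hypothetical helper, not part of duckplyr or dplyr, and it assumes the vctrs package is installed. It relies on `vctrs::vec_ptype2()` failing when two columns have no common type.

```r
# Hypothetical helper (not duckplyr or dplyr API): fail early when a key
# column of `x` and the same-named column of `y` have no common type.
check_join_keys <- function(x, y, by) {
  for (col in by) {
    tryCatch(
      # vec_ptype2() signals an error of class
      # "vctrs_error_incompatible_type" if the types cannot be combined.
      vctrs::vec_ptype2(x[[col]], y[[col]]),
      vctrs_error_incompatible_type = function(e) {
        stop(
          "Can't join `x$", col, "` with `y$", col,
          "` due to incompatible types.",
          call. = FALSE
        )
      }
    )
  }
  invisible(TRUE)
}

# double vs integer share a common type, so this passes silently
check_join_keys(data.frame(a = 1), data.frame(a = 2L), by = "a")

# double vs character have no common type, so this raises an error
try(check_join_keys(data.frame(a = 1), data.frame(a = "2"), by = "a"))
```

The sketch only handles the case where key columns have the same name in both tables; a named `by` or a `join_by()` specification would need extra handling.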

krlmlr commented May 20, 2024

Set operations are good now, joins need tidyverse/dplyr#7029.

krlmlr changed the title from "Need to be stricter about column compatibility" to "Need to be stricter about column compatibility in joins" on May 20, 2024