Should as_duckplyr_df() work with tibbles from readr::read_csv()? #127

Open
andreranza opened this issue Mar 15, 2024 · 2 comments

@andreranza
Contributor

suppressPackageStartupMessages(library(duckplyr))

df1 <- tibble::tibble(col = "A")
temp_file <- tempfile(fileext = ".csv")
readr::write_csv(df1, temp_file)

# tibble from a csv
df_duck_tib <- duckplyr_df_from_file( 
  temp_file,
  table_function = "read_csv_auto",
  class = class(tibble::tibble())
)
class(df_duck_tib)
#> [1] "duckplyr_df" "tbl_df"      "tbl"         "data.frame"

# or, data.frame from csv:
df_duck <- duckplyr_df_from_file(temp_file, table_function = "read_csv_auto")
class(df_duck)
#> [1] "duckplyr_df" "data.frame"

# however, as_duckplyr_df() fails on a tibble from readr::read_csv(),
# because readr attaches the extra `spec_tbl_df` class
spec_tbl_df <- readr::read_csv(temp_file, show_col_types = FALSE)
stopifnot("spec_tbl_df" %in% class(spec_tbl_df))
try(as_duckplyr_df(spec_tbl_df)) 
#> Error in as_duckplyr_df(spec_tbl_df) : 
#>   Must pass a plain data frame or a tibble to `as_duckplyr_df()`.

# stripping away the `spec_tbl_df` class makes the conversion succeed
class(spec_tbl_df) <- c("tbl_df", "tbl", "data.frame")
as_duckplyr_df(spec_tbl_df)
#> # A tibble: 1 × 1
#>   col  
#>   <chr>
#> 1 A

Created on 2024-03-15 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       macOS Sonoma 14.3.1
#>  system   x86_64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Rome
#>  date     2024-03-15
#>  pandoc   3.1.8 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  bit           4.0.5   2022-11-15 [1] CRAN (R 4.3.0)
#>  bit64         4.0.5   2020-08-30 [1] CRAN (R 4.3.0)
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.0)
#>  collections   0.3.7   2023-01-05 [1] CRAN (R 4.3.0)
#>  crayon        1.5.2   2022-09-29 [1] CRAN (R 4.3.0)
#>  DBI           1.2.2   2024-02-16 [1] CRAN (R 4.3.2)
#>  digest        0.6.34  2024-01-11 [1] CRAN (R 4.3.0)
#>  dplyr         1.1.4   2023-11-17 [1] CRAN (R 4.3.0)
#>  duckdb        0.9.2-1 2023-11-28 [1] CRAN (R 4.3.0)
#>  duckplyr    * 0.3.1   2024-03-10 [1] CRAN (R 4.3.2)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.0)
#>  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.0)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.0)
#>  hms           1.1.3   2023-03-21 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.0)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.0)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.26.0  2024-01-24 [1] CRAN (R 4.3.2)
#>  R.utils       2.12.3  2023-11-18 [1] CRAN (R 4.3.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
#>  readr         2.1.5   2024-01-10 [1] CRAN (R 4.3.0)
#>  reprex        2.1.0   2024-01-11 [1] CRAN (R 4.3.0)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.0)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.0)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.10.2  2023-08-29 [1] CRAN (R 4.3.0)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
#>  tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.3.0)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.0)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.0)
#>  vroom         1.6.5   2023-12-05 [1] CRAN (R 4.3.0)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.0)
#>  xfun          0.42    2024-02-08 [1] CRAN (R 4.3.2)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.0)
#> 
#> ──────────────────────────────────────────────────────────────────────────────
@krlmlr
Collaborator

krlmlr commented Mar 15, 2024

Not sure, but we can make the error message nicer by mentioning that the user might need to call as_tibble() or as.data.frame().
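
For illustration, a minimal sketch of that workaround using base R only. The class vector is set by hand to mimic what `readr::read_csv()` returns in the reprex above, so readr itself is not needed here. Whether `as_duckplyr_df()` then accepts the result is taken from the report above, not verified against the package source.

```r
# Mimic the class vector that readr::read_csv() attaches (assumption:
# same classes as in the reprex above).
df <- data.frame(col = "A")
class(df) <- c("spec_tbl_df", "tbl_df", "tbl", "data.frame")

# Option 1: convert to a plain data frame.
plain <- as.data.frame(df)
class(plain)
#> [1] "data.frame"

# Option 2: drop only the extra class and keep the tibble classes.
stripped <- df
class(stripped) <- setdiff(class(df), "spec_tbl_df")
class(stripped)
#> [1] "tbl_df"     "tbl"        "data.frame"

# Either result should pass the class check described in the error message:
# duckplyr::as_duckplyr_df(plain)
```

Option 2 is slightly more robust than hard-coding `c("tbl_df", "tbl", "data.frame")`, since it leaves any other classes untouched.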

@krlmlr krlmlr added this to the 0.4.0 milestone Mar 17, 2024
@nikostr

nikostr commented Apr 22, 2024

I noticed a similar issue for grouped data frames. Maybe it would also make sense to put the current class in the error message? Something like

Expected class "data.frame" or class "tbl_df" "tbl" "data.frame" but got class "grouped_df" "tbl_df" "tbl" "data.frame"

or some nicer variation of this?
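
A rough sketch of what such a check could look like, in base R. The helper name `check_plain_df()` and the hint text are hypothetical, and this is not duckplyr's actual implementation; it only illustrates reporting both the expected and the actual class.

```r
# Hypothetical helper, not part of duckplyr: accept only a plain data frame
# or a plain tibble, and report the offending class vector otherwise.
check_plain_df <- function(x) {
  allowed <- list(
    "data.frame",
    c("tbl_df", "tbl", "data.frame")
  )
  cls <- class(x)
  ok <- any(vapply(allowed, identical, logical(1), cls))
  if (!ok) {
    # Format a class vector as: "a" "b" "c"
    fmt <- function(cl) paste0('"', cl, '"', collapse = " ")
    stop(
      "Expected class ", fmt(allowed[[1]]), " or class ", fmt(allowed[[2]]),
      " but got class ", fmt(cls), ".\n",
      "Consider calling `as_tibble()` or `as.data.frame()` first.",
      call. = FALSE
    )
  }
  invisible(x)
}

# Usage: a data frame carrying an extra class (set by hand here) is rejected
# with a message that names the class it actually has.
grouped <- data.frame(col = "A")
class(grouped) <- c("grouped_df", "tbl_df", "tbl", "data.frame")
msg <- tryCatch(check_plain_df(grouped), error = conditionMessage)
cat(msg, "\n")
```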
