Commit

Fix #315 (#367)
* Fix #315

* Bump ver.

* Make it possible to setclass arrow

* Make ArrowTabular exportable

* how about this?

* skip the tests for 3.6
chainsawriot committed Sep 13, 2023
1 parent c50f10e commit 114a735
Showing 13 changed files with 39 additions and 17 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: rio
Type: Package
Title: A Swiss-Army Knife for Data I/O
-Version: 0.5.32
+Version: 0.5.33
Authors@R: c(person("Jason", "Becker", role = "aut", email = "jason@jbecker.co"),
person("Chung-hong", "Chan", role = c("aut", "cre"), email = "chainsawtiney@gmail.com",
comment = c(ORCID = "0000-0002-6232-7530")),
@@ -49,6 +49,7 @@ Imports:
curl (>= 0.6),
data.table (>= 1.11.2),
readxl (>= 0.1.1),
+arrow (>= 0.17.0),
tibble,
stringi,
writexl,
@@ -60,7 +61,6 @@ Suggests:
testthat,
knitr,
magrittr,
-arrow (>= 0.17.0),
clipr,
fst,
hexView,
1 change: 1 addition & 0 deletions NEWS.md
@@ -7,6 +7,7 @@
* `get_info` is added #350
* POTENTIALLY BREAKING: `setclass` parameter is now authoritative. Therefore: `import("starwars.csv", data.table = TRUE, setclass = "tibble")` will return a tibble (unlike previous versions where a data.table is returned). The default class is data frame. You can either explicitly use the `setclass` parameter; or set the option: `options(rio.import.class = "data.table")`. h/t David Schoch #336
* Use `writexl` instead of `openxlsx`. Option to read xlsx with `openxlsx` (i.e. `import("starwars.xlsx", readxl = FALSE)`) is always `TRUE`. The ability to overwrite an existing sheet in an existing xlsx file is also removed. It is against the design principle of `rio`.
+* Parquet and feather are now formats supported out of the box; Possible to setclass to `arrow` / `arrow_table`; ArrowTabular class can be exported #315
* Add "extension" vignette
* POTENTIALLY BREAKING: The following options are deprecated: `import(fread)`, `import(readr = TRUE)`, `import(haven)`, `import(readxl)` and `export(fwrite)`. import will almost use `data.table`, `haven`, `readxl`, and internal function (for fwf) to import and export data. Currently, those options stay for backward compatibility but will be removed in v2.0.0. #343 h/t David Schoch
* POTENTIALLY BREAKING: `...` is handled differently. Underlying functions using "Tidy" convention (e.g. `readxl::read_xlsx()`) can use "Base Convention" (See the new vignette: `remap`). Unused arguments passed to the underlying function as `...` are silently ignored by default. A new option `rio.ignoreunusedargs` is added to control this behavior. #326
2 changes: 1 addition & 1 deletion R/export.R
@@ -96,7 +96,7 @@ export <- function(x, file, format, ...) {
}
format <- .standardize_format(format)
outfile <- file
-if (is.matrix(x)) {
+if (is.matrix(x) || inherits(x, "ArrowTabular")) {
x <- as.data.frame(x)
}
if (!is.data.frame(x) && !format %in% c("xlsx", "html", "rdata", "rds", "json", "qs")) {
2 changes: 0 additions & 2 deletions R/export_methods.R
@@ -144,7 +144,6 @@ export_delim <- function(file, x, fwrite = lifecycle::deprecated(), sep = "\t",

#' @export
.export.rio_feather <- function(file, x, ...) {
-.check_pkg_availability("arrow")
arrow::write_feather(x = x, sink = file, ...)
}

@@ -286,7 +285,6 @@ export_delim <- function(file, x, fwrite = lifecycle::deprecated(), sep = "\t",

#' @export
.export.rio_parquet <- function(file, x, ...) {
-.check_pkg_availability("arrow")
arrow::write_parquet(x = x, sink = file, ...)
}

2 changes: 1 addition & 1 deletion R/import.R
@@ -6,7 +6,7 @@
#' @param setclass An optional character vector specifying one or more classes
#' to set on the import. By default, the return object is always a
#' \dQuote{data.frame}. Allowed values include \dQuote{tbl_df}, \dQuote{tbl}, or
-#' \dQuote{tibble} (if using dplyr) or \dQuote{data.table} (if using
+#' \dQuote{tibble} (if using tibble), \dQuote{arrow}, \dQuote{arrow_table} (if using arrow table) or \dQuote{data.table} (if using
#' data.table). Other values are ignored, such that a data.frame is returned.
#' The parameter takes precedents over parameters in \dots which set a different class.
#' @param which This argument is used to control import from multi-object files; as a rule `import` only ever returns a single data frame (use [import_list()] to import multiple data frames from a multi-object file). If `file` is a compressed directory, `which` can be either a character string specifying a filename or an integer specifying which file (in locale sort order) to extract from the compressed directory. For Excel spreadsheets, this can be used to specify a sheet name or number. For .Rdata files, this can be an object name. For HTML files, it identifies which table to extract (from document order). Ignored otherwise. A character string value will be used as a regular expression, such that the extracted file is the first match of the regular expression against the file names in the archive.
2 changes: 0 additions & 2 deletions R/import_methods.R
@@ -161,7 +161,6 @@ import_delim <- function(file, which = 1, sep = "auto", header = "auto", strings

#' @export
.import.rio_feather <- function(file, which = 1, ...) {
-.check_pkg_availability("feather")
arrow::read_feather(file = file, ...)
}

@@ -390,7 +389,6 @@ extract_html_row <- function(x, empty_value) {

#' @export
.import.rio_parquet <- function(file, which = 1, as_data_frame = TRUE, ...) {
-.check_pkg_availability("arrow")
arrow::read_parquet(file = file, as_data_frame = TRUE, ...)
}

10 changes: 10 additions & 0 deletions R/set_class.R
@@ -11,9 +11,19 @@ set_class <- function(x, class = NULL) {
return(.ensure_tibble(x))
}

+if (any(c("arrow", "arrow_table") %in% class)) {
+return(.ensure_arrow(x))
+}
return(.ensure_data_frame(x))
}

+.ensure_arrow <- function(x) {
+if (inherits(x, "ArrowTabular")) {
+return(x)
+}
+return(arrow::arrow_table(x))
+}
+
.ensure_data_table <- function(x) {
if (inherits(x, "data.table")) {
return(x)
2 changes: 1 addition & 1 deletion man/import.Rd
2 changes: 1 addition & 1 deletion man/import_list.Rd
4 changes: 2 additions & 2 deletions tests/testthat/test_format_csv.R
@@ -25,8 +25,8 @@ test_that("Import from (European-style) CSV with semicolon separator", {
write.table(iris, "iris2.csv", dec = ",", sep = ";", row.names = FALSE)
expect_true("iris2.csv" %in% dir())
# import works (even if column classes are incorrect)
-expect_true(is.data.frame(import("iris2.csv", fread = TRUE, header = TRUE)))
-iris_imported <- import("iris2.csv", format = ";", fread = TRUE, header = TRUE)
+expect_true(is.data.frame(import("iris2.csv", header = TRUE)))
+iris_imported <- import("iris2.csv", format = ";", header = TRUE)
# import works with correct, numeric column classes
expect_true(is.data.frame(iris_imported))
expect_true(is.numeric(iris_imported[["Sepal.Length"]]))
3 changes: 0 additions & 3 deletions tests/testthat/test_format_feather.R
@@ -2,17 +2,14 @@ context("feather imports/exports")
require("datasets")

test_that("Export to feather", {
-skip_if_not_installed(pkg="feather")
expect_true(export(iris, "iris.feather") %in% dir())
})

test_that("Import from feather", {
-skip_if_not_installed(pkg="feather")
expect_true(is.data.frame(import("iris.feather")))
})

test_that("... correctly passed, #318", {
-skip_if_not_installed(pkg="feather")
## actually feather::write_feather has only two arguments (as of 2023-09-01)
## it is more for possible future expansion
expect_error(export(mtcars, "mtcars.feather", hello = 42))
1 change: 0 additions & 1 deletion tests/testthat/test_format_parquet.R
@@ -2,7 +2,6 @@ context("Parquet imports/exports")
require("datasets")

test_that("Export to and import from parquet", {
-skip_if_not_installed("arrow")
expect_true(export(iris, "iris.parquet") %in% dir())
expect_true(is.data.frame(import("iris.parquet")))
unlink("iris.parquet")
21 changes: 20 additions & 1 deletion tests/testthat/test_set_class.R
@@ -7,7 +7,6 @@ test_that("Set object class", {
expect_true(inherits(set_class(mtcars), "data.frame"))
expect_true(inherits(set_class(mtcars_tibble), "data.frame"))
expect_true(inherits(set_class(mtcars_datatable), "data.frame"))
-
expect_true(inherits(set_class(mtcars, class = "fakeclass"), "data.frame"))
expect_true(!"fakeclass" %in% class(set_class(mtcars, class = "fakeclass")))
})
@@ -25,3 +24,23 @@ test_that("Set object class as data.table", {
expect_true(inherits(import("mtcars.csv", data.table = TRUE, setclass = "data.table"), "data.table"))
unlink("mtcars.csv")
})
+
+test_that("Set object class as arrow table", {
+skip_if(getRversion() <= "4.2")
+mtcars_arrow <- arrow::arrow_table(mtcars)
+expect_false(inherits(set_class(mtcars_arrow), "data.frame")) ## arrow table is not data.frame
+expect_true(inherits(set_class(mtcars, class = "arrow"), "ArrowTabular"))
+expect_true(inherits(set_class(mtcars, class = "arrow_table"), "ArrowTabular"))
+export(mtcars, "mtcars.csv")
+expect_true(inherits(import("mtcars.csv", setclass = "arrow"), "ArrowTabular"))
+expect_true(inherits(import("mtcars.csv", data.table = TRUE, setclass = "arrow"), "ArrowTabular"))
+unlink("mtcars.csv")
+})
+
+test_that("ArrowTabular can be exported", {
+skip_if(getRversion() <= "4.2")
+mtcars_arrow <- arrow::arrow_table(mtcars)
+expect_error(export(mtcars_arrow, "mtcars.csv"), NA) ## no concept of rownames
+expect_true(inherits(import("mtcars.csv"), "data.frame"))
+unlink("mtcars.csv")
+})
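
Usage sketch (not part of the commit): the snippet below mirrors the new tests to show the behaviour this change appears to enable from a user's point of view. It assumes a rio build at or after this commit (0.5.33) with the arrow package installed; the temporary file paths and variable names are hypothetical.

```r
# Assumes rio >= 0.5.33 (this commit or later) and arrow are installed.
library(rio)

# Hypothetical temporary paths, used so the working directory is left untouched.
csv_path <- tempfile(fileext = ".csv")
roundtrip_path <- tempfile(fileext = ".csv")

export(mtcars, csv_path)

# setclass = "arrow" (or "arrow_table") asks import() to return an Arrow table.
mtcars_arrow <- import(csv_path, setclass = "arrow")
inherits(mtcars_arrow, "ArrowTabular")

# export() converts ArrowTabular input with as.data.frame() before writing.
export(mtcars_arrow, roundtrip_path)
mtcars_back <- import(roundtrip_path)
inherits(mtcars_back, "data.frame")

unlink(c(csv_path, roundtrip_path))
```

As the test comment in the diff notes, an Arrow table has no concept of row names, so row names are not expected to survive a round trip through it.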
