Generate the Roxygen using internal data #313

Open
chainsawriot opened this issue Aug 29, 2023 · 11 comments

@chainsawriot

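The supported file formats table should be saved in a machine-readable format and made the single source of truth. Otherwise, updating both the README and the vignette is a chore.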

@chainsawriot

chainsawriot commented Sep 3, 2023

A single source of truth would also be useful for the following mapping, which should be turned into data.

rio/R/utils.R

Lines 33 to 110 in 6af9104

type_list <- list(
clipboard = "clipboard",
# supported formats
"," = "csv",
";" = "csv2",
"\t" = "tsv",
"|" = "psv",
arff = "arff",
csv = "csv",
csv2 = "csv2",
csvy = "csvy",
dbf = "dbf",
dif = "dif",
dta = "dta",
dump = "dump",
epiinfo = "rec",
excel = "xlsx",
feather = "feather",
fortran = "fortran",
fst = "fst",
fwf = "fwf",
htm = "html",
html = "html",
json = "json",
mat = "matlab",
matlab = "matlab",
minitab = "mtp",
mtp = "mtp",
ods = "ods",
por = "spss",
psv = "psv",
r = "r",
rda = "rdata",
rdata = "rdata",
rds = "rds",
rec = "rec",
sas = "sas7bdat",
sas7bdat = "sas7bdat",
sav = "sav",
spss = "sav",
stata = "dta",
syd = "syd",
systat = "syd",
tsv = "tsv",
txt = "tsv",
weka = "arff",
xls = "xls",
xlsx = "xlsx",
xml = "xml",
xport = "xpt",
xpt = "xpt",
yaml = "yml",
yml = "yml",
eviews = "eviews",
wf1 = "eviews",
zsav = "zsav",
# compressed formats
csv.gz = "gzip",
csv.gzip = "gzip",
gz = "gzip",
gzip = "gzip",
tar = "tar",
zip = "zip",
# known but unsupported formats
bib = "bib",
bibtex = "bib",
bmp = "bmp",
gexf = "gexf",
gnumeric = "gnumeric",
jpeg = "jpg",
jpg = "jpg",
npy = "npy",
png = "png",
sdmx = "sdmx",
sss = "sss",
tif = "tiff",
tiff = "tiff"
)
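
As a rough sketch of what "datafied" could look like, the alias list can be flattened into a data frame with one row per alias. The list below is an abbreviated copy of the one quoted above, and the column names `input` and `format` as well as the helper `aliases_of()` are hypothetical, not part of the package.

```r
# Abbreviated copy of the alias list quoted above (only a few entries).
type_list <- list(
  "," = "csv",
  "\t" = "tsv",
  csv = "csv",
  excel = "xlsx",
  xlsx = "xlsx",
  txt = "tsv",
  bibtex = "bib"
)

# One row per alias; `input` and `format` are hypothetical column names.
formats <- data.frame(
  input = names(type_list),
  format = unlist(type_list, use.names = FALSE),
  stringsAsFactors = FALSE
)

# All aliases that resolve to the same canonical format (hypothetical helper).
aliases_of <- function(fmt) {
  formats$input[formats$format == fmt]
}

aliases_of("xlsx")
```

Once the mapping is a data frame, further columns (package, function, note) can be added per row, which a named list cannot hold as easily.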

@chainsawriot

chainsawriot commented Sep 3, 2023

  • Catalogue how many sources of truth there are.

@chainsawriot

chainsawriot commented Sep 3, 2023

rio/README.Rmd

Lines 94 to 132 in 6af9104

| Format | Typical Extension | Import Package | Export Package | Installed by Default |
| ------ | --------- | -------------- | -------------- | -------------------- |
| Comma-separated data | .csv | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| Pipe-separated data | .psv | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| Tab-separated data | .tsv | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| CSVY (CSV + YAML metadata header) | .csvy | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| SAS | .sas7bdat | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) (but [deprecated](https://github.com/tidyverse/haven/issues/224)) | Yes |
| SPSS | .sav | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| SPSS (compressed) | .zsav | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| Stata | .dta | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| SAS XPORT | .xpt | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| SPSS Portable | .por | [**haven**](https://cran.r-project.org/package=haven) | | Yes |
| Excel | .xls | [**readxl**](https://cran.r-project.org/package=readxl) | | Yes |
| Excel | .xlsx | [**readxl**](https://cran.r-project.org/package=readxl) | [**openxlsx**](https://cran.r-project.org/package=openxlsx) | Yes |
| R syntax | .R | **base** | **base** | Yes |
| Saved R objects | .RData, .rda | **base** | **base** | Yes |
| Serialized R objects | .rds | **base** | **base** | Yes |
| Epiinfo | .rec | [**foreign**](https://cran.r-project.org/package=foreign) | | Yes |
| Minitab | .mtp | [**foreign**](https://cran.r-project.org/package=foreign) | | Yes |
| Systat | .syd | [**foreign**](https://cran.r-project.org/package=foreign) | | Yes |
| "XBASE" database files | .dbf | [**foreign**](https://cran.r-project.org/package=foreign) | [**foreign**](https://cran.r-project.org/package=foreign) | Yes |
| Weka Attribute-Relation File Format | .arff | [**foreign**](https://cran.r-project.org/package=foreign) | [**foreign**](https://cran.r-project.org/package=foreign) | Yes |
| Data Interchange Format | .dif | **utils** | | Yes |
| Fortran data | no recognized extension | **utils** | | Yes |
| Fixed-width format data | .fwf | **utils** | **utils** | Yes |
| gzip comma-separated data | .csv.gz | **utils** | **utils** | Yes |
| Apache Arrow (Parquet) | .parquet | [**arrow**](https://cran.r-project.org/package=arrow) | [**arrow**](https://cran.r-project.org/package=arrow) | No |
| EViews | .wf1 | [**hexView**](https://cran.r-project.org/package=hexView) | | No |
| Feather R/Python interchange format | .feather | [**feather**](https://cran.r-project.org/package=feather) | [**feather**](https://cran.r-project.org/package=feather) | No |
| Fast Storage | .fst | [**fst**](https://cran.r-project.org/package=fst) | [**fst**](https://cran.r-project.org/package=fst) | No |
| JSON | .json | [**jsonlite**](https://cran.r-project.org/package=jsonlite) | [**jsonlite**](https://cran.r-project.org/package=jsonlite) | No |
| Matlab | .mat | [**rmatio**](https://cran.r-project.org/package=rmatio) | [**rmatio**](https://cran.r-project.org/package=rmatio) | No |
| OpenDocument Spreadsheet | .ods | [**readODS**](https://cran.r-project.org/package=readODS) | [**readODS**](https://cran.r-project.org/package=readODS) | No |
| HTML Tables | .html | [**xml2**](https://cran.r-project.org/package=xml2) | [**xml2**](https://cran.r-project.org/package=xml2) | No |
| Shallow XML documents | .xml | [**xml2**](https://cran.r-project.org/package=xml2) | [**xml2**](https://cran.r-project.org/package=xml2) | No |
| YAML | .yml | [**yaml**](https://cran.r-project.org/package=yaml) | [**yaml**](https://cran.r-project.org/package=yaml) | No |
| Clipboard | default is tsv | [**clipr**](https://cran.r-project.org/package=clipr) | [**clipr**](https://cran.r-project.org/package=clipr) | No |
| [Google Sheets](https://www.google.com/sheets/about/) | as Comma-separated data | | | |
| Graphpad Prism | .pzfx | [**pzfx**](https://cran.r-project.org/package=pzfx) | [**pzfx**](https://cran.r-project.org/package=pzfx) | No |

rio/vignettes/rio.Rmd

Lines 36 to 73 in 6af9104

| Format | Typical Extension | Import Package | Export Package | Installed by Default |
| ------ | --------- | -------------- | -------------- | -------------------- |
| Comma-separated data | .csv | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| Pipe-separated data | .psv | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| Tab-separated data | .tsv | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| CSVY (CSV + YAML metadata header) | .csvy | [**data.table**](https://cran.r-project.org/package=data.table) | [**data.table**](https://cran.r-project.org/package=data.table) | Yes |
| SAS | .sas7bdat | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| SPSS | .sav | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| SPSS (compressed) | .zsav | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| Stata | .dta | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| SAS XPORT | .xpt | [**haven**](https://cran.r-project.org/package=haven) | [**haven**](https://cran.r-project.org/package=haven) | Yes |
| SPSS Portable | .por | [**haven**](https://cran.r-project.org/package=haven) | | Yes |
| Excel | .xls | [**readxl**](https://cran.r-project.org/package=readxl) | | Yes |
| Excel | .xlsx | [**readxl**](https://cran.r-project.org/package=readxl) | [**openxlsx**](https://cran.r-project.org/package=openxlsx) | Yes |
| R syntax | .R | **base** | **base** | Yes |
| Saved R objects | .RData, .rda | **base** | **base** | Yes |
| Serialized R objects | .rds | **base** | **base** | Yes |
| Epiinfo | .rec | [**foreign**](https://cran.r-project.org/package=foreign) | | Yes |
| Minitab | .mtp | [**foreign**](https://cran.r-project.org/package=foreign) | | Yes |
| Systat | .syd | [**foreign**](https://cran.r-project.org/package=foreign) | | Yes |
| "XBASE" database files | .dbf | [**foreign**](https://cran.r-project.org/package=foreign) | [**foreign**](https://cran.r-project.org/package=foreign) | Yes |
| Weka Attribute-Relation File Format | .arff | [**foreign**](https://cran.r-project.org/package=foreign) | [**foreign**](https://cran.r-project.org/package=foreign) | Yes |
| Data Interchange Format | .dif | **utils** | | Yes |
| Fortran data | no recognized extension | **utils** | | Yes |
| Fixed-width format data | .fwf | **utils** | **utils** | Yes |
| gzip comma-separated data | .csv.gz | **utils** | **utils** | Yes |
| Apache Arrow (Parquet) | .parquet | [**arrow**](https://cran.r-project.org/package=arrow) | [**arrow**](https://cran.r-project.org/package=arrow) | No |
| EViews | .wf1 | [**hexView**](https://cran.r-project.org/package=hexView) | | No |
| Feather R/Python interchange format | .feather | [**feather**](https://cran.r-project.org/package=feather) | [**feather**](https://cran.r-project.org/package=feather) | No |
| Fast Storage | .fst | [**fst**](https://cran.r-project.org/package=fst) | [**fst**](https://cran.r-project.org/package=fst) | No |
| JSON | .json | [**jsonlite**](https://cran.r-project.org/package=jsonlite) | [**jsonlite**](https://cran.r-project.org/package=jsonlite) | No |
| Matlab | .mat | [**rmatio**](https://cran.r-project.org/package=rmatio) | [**rmatio**](https://cran.r-project.org/package=rmatio) | No |
| OpenDocument Spreadsheet | .ods | [**readODS**](https://cran.r-project.org/package=readODS) | [**readODS**](https://cran.r-project.org/package=readODS) | No |
| HTML Tables | .html | [**xml2**](https://cran.r-project.org/package=xml2) | [**xml2**](https://cran.r-project.org/package=xml2) | No |
| Shallow XML documents | .xml | [**xml2**](https://cran.r-project.org/package=xml2) | [**xml2**](https://cran.r-project.org/package=xml2) | No |
| YAML | .yml | [**yaml**](https://cran.r-project.org/package=yaml) | [**yaml**](https://cran.r-project.org/package=yaml) | No |
| Clipboard | default is tsv | [**clipr**](https://cran.r-project.org/package=clipr) | [**clipr**](https://cran.r-project.org/package=clipr) | No |
| [Google Sheets](https://www.google.com/sheets/about/) | as Comma-separated data | | | |
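
Both tables above could in principle be rendered from the same data. The following is a minimal base R sketch under that assumption; the data frame is a made-up three-row excerpt, and its column names and the helpers `cran_link()` and `format_table()` are hypothetical.

```r
# Hypothetical excerpt of the internal data; the column names are assumptions.
formats <- data.frame(
  format_name = c("Comma-separated data", "Excel", "JSON"),
  extension = c(".csv", ".xls", ".json"),
  import_pkg = c("data.table", "readxl", "jsonlite"),
  export_pkg = c("data.table", NA, "jsonlite"),
  default = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Link a package to its CRAN page, or return an empty cell when there is none.
cran_link <- function(pkg) {
  ifelse(
    is.na(pkg),
    "",
    sprintf("[**%s**](https://cran.r-project.org/package=%s)", pkg, pkg)
  )
}

# Build the Markdown pipe table as a character vector, one element per line.
format_table <- function(x) {
  header <- c(
    "| Format | Typical Extension | Import Package | Export Package | Installed by Default |",
    "| ------ | --------- | -------------- | -------------- | -------------------- |"
  )
  rows <- sprintf(
    "| %s | %s | %s | %s | %s |",
    x$format_name, x$extension,
    cran_link(x$import_pkg), cran_link(x$export_pkg),
    ifelse(x$default, "Yes", "No")
  )
  c(header, rows)
}

writeLines(format_table(formats))
```

In an R Markdown file such as the README or the vignette, a chunk with `results = "asis"` could emit these lines, so the two tables would no longer be edited by hand. Base packages such as **base** and **utils** are not linked to CRAN in the tables above and would need a special case.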

rio/R/utils.R

Lines 33 to 110 in 6af9104 (the same `type_list` quoted in the earlier comment)

rio/R/extensions.R

Lines 25 to 45 in 6af9104

out <- switch(fmt,
bean = sprintf(xA, fmt, "ledger", "ledger"),
beancount = sprintf(xA, fmt, "ledger", "ledger"),
bib = sprintf(x, fmt, "bib2df::bib2df"),
bmp = sprintf(x, fmt, "bmp::read.bmp"),
doc = sprintf(x, fmt, "docxtractr::docx_extract_all_tbls"),
docx = sprintf(x, fmt, "docxtractr::docx_extract_all_tbls"),
gexf = sprintf(x, fmt, "rgexf::read.gexf"),
gnumeric = sprintf(x, fmt, "gnumeric::read.gnumeric.sheet"),
hledger = sprintf(xA, fmt, "ledger", "ledger"),
jpeg = sprintf(x, fmt, "jpeg::readJPEG"),
jpg = sprintf(x, fmt, "jpeg::readJPEG"),
ledger = sprintf(xA, fmt, "ledger", "ledger"),
npy = sprintf(x, fmt, "RcppCNPy::npyLoad"),
qs = sprintf(x, fmt, "qs::qread"),
pdf = sprintf(x, fmt, "tabulizer::extract_tables"),
png = sprintf(x, fmt, "png::readPNG"),
sdmx = sprintf(x, fmt, "sdmx::readSDMX"),
sss = sprintf(x, fmt, "sss::read.sss"),
tiff = sprintf(x, fmt, "tiff::readTIFF"),
gettext("Format not supported"))

rio/R/extensions.R

Lines 59 to 67 in 6af9104

out <- switch(fmt,
gexf = sprintf(x, fmt, "rgexf::write.gexf"),
jpg = sprintf(x, fmt, "jpeg::writeJPEG"),
npy = sprintf(x, fmt, "RcppCNPy::npySave"),
png = sprintf(x, fmt, "png::writePNG"),
qs = sprintf(x, fmt, "qs::qsave"),
tiff = sprintf(x, fmt, "tiff::writeTIFF"),
xpt = sprintf(x, fmt, "SASxport::write.xport"),
gettext("Format not supported"))
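
The two `switch()` calls above could be replaced by a lookup in the same data. This is only a sketch: the data frame is a small excerpt built from the function names quoted above, the column names are assumptions, and the message wording is invented because the original template `x` is not shown here.

```r
# Hypothetical excerpt; function names are taken from the switch() calls above.
known <- data.frame(
  format = c("bib", "npy", "png", "xpt"),
  import_fun = c("bib2df::bib2df", "RcppCNPy::npyLoad", "png::readPNG", NA),
  export_fun = c(NA, "RcppCNPy::npySave", "png::writePNG", "SASxport::write.xport"),
  stringsAsFactors = FALSE
)

# The message wording is an assumption, not the package's actual text.
unsupported_message <- function(fmt, direction = c("import", "export")) {
  direction <- match.arg(direction)
  column <- paste0(direction, "_fun")
  # match() returns NA for an unknown format, which indexes to NA below.
  fun <- known[[column]][match(fmt, known$format)]
  if (is.na(fun)) {
    return("Format not supported")
  }
  sprintf("%s format not supported. Consider using the '%s()' function", fmt, fun)
}

unsupported_message("npy", "export")
unsupported_message("bib", "export")
```

Adding a newly known format then means adding a row instead of editing two functions.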

The documentation also repeats this information, but it is probably difficult to reduce it to a single source.

rio/R/export.R

Lines 15 to 43 in 6af9104

#' \itemize{
#' \item Comma-separated data (.csv), using [data.table::fwrite()] or, if `fwrite = TRUE`, [utils::write.table()] with `row.names = FALSE`.
#' \item Pipe-separated data (.psv), using [data.table::fwrite()] or, if `fwrite = TRUE`, [utils::write.table()] with `sep = '|'` and `row.names = FALSE`.
#' \item Tab-separated data (.tsv), using [data.table::fwrite()] or, if `fwrite = TRUE`, [utils::write.table()] with `row.names = FALSE`.
#' \item SAS (.sas7bdat), using [haven::write_sas()].
#' \item SAS XPORT (.xpt), using [haven::write_xpt()].
#' \item SPSS (.sav), using [haven::write_sav()]
#' \item SPSS compressed (.zsav), using [haven::write_sav()]
#' \item Stata (.dta), using [haven::write_dta()]. Note that variable/column names containing dots (.) are not allowed and will produce an error.
#' \item Excel (.xlsx), using [openxlsx::write.xlsx()]. Existing workbooks are overwritten unless `which` is specified, in which case only the specified sheet (if it exists) is overwritten. If the file exists but the `which` sheet does not, data are added as a new sheet to the existing workbook. `x` can also be a list of data frames; the list entry names are used as sheet names.
#' \item R syntax object (.R), using [base::dput()] (by default) or [base::dump()] (if `format = 'dump'`)
#' \item Saved R objects (.RData,.rda), using [base::save()]. In this case, `x` can be a data frame, a named list of objects, an R environment, or a character vector containing the names of objects if a corresponding `envir` argument is specified.
#' \item Serialized R objects (.rds), using [base::saveRDS()]. In this case, `x` can be any serializable R object.
#' \item "XBASE" database files (.dbf), using [foreign::write.dbf()]
#' \item Weka Attribute-Relation File Format (.arff), using [foreign::write.arff()]
#' \item Fixed-width format data (.fwf), using [utils::write.table()] with `row.names = FALSE`, `quote = FALSE`, and `col.names = FALSE`
#' \item gzip comma-separated data (.csv.gz), using [utils::write.table()] with `row.names = FALSE`
#' \item [CSVY](https://github.com/csvy) (CSV with a YAML metadata header) using [data.table::fwrite()].
#' \item Apache Arrow Parquet (.parquet), using [arrow::write_parquet()]
#' \item Feather R/Python interchange format (.feather), using [feather::write_feather()]
#' \item Fast storage (.fst), using [fst::write.fst()]
#' \item JSON (.json), using [jsonlite::toJSON()]. In this case, `x` can be a variety of R objects, based on class mapping conventions in this paper: [https://arxiv.org/abs/1403.2805](https://arxiv.org/abs/1403.2805).
#' \item Matlab (.mat), using [rmatio::write.mat()]
#' \item OpenDocument Spreadsheet (.ods), using [readODS::write_ods()]. (Currently only single-sheet exports are supported.)
#' \item HTML (.html), using a custom method based on [xml2::xml_add_child()] to create a simple HTML table and [xml2::write_xml()] to write to disk.
#' \item XML (.xml), using a custom method based on [xml2::xml_add_child()] to create a simple XML tree and [xml2::write_xml()] to write to disk.
#' \item YAML (.yml), using [yaml::write_yaml()], default to write the content with UTF-8. Might not work on some older systems, e.g. default Windows locale for R <= 4.2.
#' \item Clipboard export (on Windows and Mac OS), using [utils::write.table()] with `row.names = FALSE`
#' }

rio/R/import.R

Lines 14 to 51 in 6af9104

#' \itemize{
#' \item Comma-separated data (.csv), using [data.table::fread()] or, if `fread = FALSE`, [utils::read.table()] with `row.names = FALSE` and `stringsAsFactors = FALSE`
#' \item Pipe-separated data (.psv), using [data.table::fread()] or, if `fread = FALSE`, [utils::read.table()] with `sep = '|'`, `row.names = FALSE` and `stringsAsFactors = FALSE`
#' \item Tab-separated data (.tsv), using [data.table::fread()] or, if `fread = FALSE`, [utils::read.table()] with `row.names = FALSE` and `stringsAsFactors = FALSE`
#' \item SAS (.sas7bdat), using [haven::read_sas()].
#' \item SAS XPORT (.xpt), using [haven::read_xpt()] or, if `haven = FALSE`, [foreign::read.xport()].
#' \item SPSS (.sav), using [haven::read_sav()]. If `haven = FALSE`, [foreign::read.spss()] can be used.
#' \item SPSS compressed (.zsav), using [haven::read_sav()].
#' \item Stata (.dta), using [haven::read_dta()]. If `haven = FALSE`, [foreign::read.dta()] can be used.
#' \item SPSS Portable Files (.por), using [haven::read_por()].
#' \item Excel (.xls and .xlsx), using [readxl::read_excel()]. Use `which` to specify a sheet number. For .xlsx files, it is possible to set `readxl = FALSE`, so that [openxlsx::read.xlsx()] can be used instead of readxl (the default).
#' \item R syntax object (.R), using [base::dget()]
#' \item Saved R objects (.RData,.rda), using [base::load()] for single-object .Rdata files. Use `which` to specify an object name for multi-object .Rdata files. This can be any R object (not just a data frame).
#' \item Serialized R objects (.rds), using [base::readRDS()]. This can be any R object (not just a data frame).
#' \item Epiinfo (.rec), using [foreign::read.epiinfo()]
#' \item Minitab (.mtp), using [foreign::read.mtp()]
#' \item Systat (.syd), using [foreign::read.systat()]
#' \item "XBASE" database files (.dbf), using [foreign::read.dbf()]
#' \item Weka Attribute-Relation File Format (.arff), using [foreign::read.arff()]
#' \item Data Interchange Format (.dif), using [utils::read.DIF()]
#' \item Fortran data (no recognized extension), using [utils::read.fortran()]
#' \item Fixed-width format data (.fwf), using a faster version of [utils::read.fwf()] that requires a `widths` argument and by default in rio has `stringsAsFactors = FALSE`. If `readr = TRUE`, import will be performed using [readr::read_fwf()], where `widths` should be: `NULL`, a vector of column widths, or the output of [readr::fwf_empty()], [readr::fwf_widths()], or [readr::fwf_positions()].
#' \item gzip comma-separated data (.csv.gz), using [utils::read.table()] with `row.names = FALSE` and `stringsAsFactors = FALSE`
#' \item [CSVY](https://github.com/csvy) (CSV with a YAML metadata header) using [data.table::fread()].
#' \item Apache Arrow Parquet (.parquet), using [arrow::read_parquet()]
#' \item Feather R/Python interchange format (.feather), using [feather::read_feather()]
#' \item Fast storage (.fst), using [fst::read.fst()]
#' \item JSON (.json), using [jsonlite::fromJSON()]
#' \item Matlab (.mat), using [rmatio::read.mat()]
#' \item EViews (.wf1), using [hexView::readEViews()]
#' \item OpenDocument Spreadsheet (.ods), using [readODS::read_ods()]. Use `which` to specify a sheet number.
#' \item Single-table HTML documents (.html), using [xml2::read_html()]. There is no standard HTML table and we have only tested this with HTML tables exported with this package. HTML tables will only be read correctly if the HTML file can be converted to a list via [xml2::as_list()]. This import feature is not robust, especially for HTML tables in the wild. Please use a proper web scraping framework, e.g. `rvest`.
#' \item Shallow XML documents (.xml), using [xml2::read_xml()]. The data structure will only be read correctly if the XML file can be converted to a list via [xml2::as_list()].
#' \item YAML (.yml), using [yaml::yaml.load()]
#' \item Clipboard import, using [utils::read.table()] with `row.names = FALSE`
#' \item Google Sheets, as Comma-separated data (.csv)
#' \item GraphPad Prism (.pzfx) using [pzfx::read_pzfx()]
#' }

@chainsawriot

With this, we can also remove the parsing of DESCRIPTION:

rio/R/suggestions.R

Lines 21 to 51 in 442b7d3

uninstalled_formats <- function() {
# Suggested packages (robust to changes in DESCRIPTION file)
# Instead of flagging *new* suggestions by hand, this method only requires
# flagging *non-import* suggestions (such as `devtools`, `knitr`, etc.).
# This could be even more robust if the call to `install_formats()` instead
# wrapped a call to `<devtools|remotes>::install_deps(dependencies =
# "Suggests")`, since this retains the package versioning (e.g. `xml2 (>=
# 1.2.0)`) suggested in the `DESCRIPTION` file. However, this seems a bit
# recursive, as `devtools` or `remotes` are often also in the `Suggests`
# field.
suggestions <- read.dcf(system.file("DESCRIPTION", package = utils::packageName(), mustWork = TRUE), fields = "Suggests")
suggestions <- parse_suggestions(suggestions)
common_suggestions <- c("bit64", "datasets", "devtools", "knitr", "magrittr", "testthat")
suggestions <- setdiff(suggestions, common_suggestions)
# which are not installed
unlist(lapply(suggestions, function(x) {
if (length(find.package(x, quiet = TRUE))) {
NULL
} else {
x
}
}))
}
parse_suggestions <- function(suggestions) {
suggestions <- unlist(strsplit(suggestions, split = ",|, |\n"))
suggestions <- gsub("\\s*\\(.*\\)", "", suggestions)
suggestions <- sort(suggestions[suggestions != ""])
suggestions
}
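
If each format row records its package and dependency type, the list of uninstalled packages can be computed without reading DESCRIPTION. A minimal sketch, assuming hypothetical `pkg` and `type` columns and a hypothetical function name:

```r
# Hypothetical excerpt of the internal data; `pkg` and `type` are assumed columns.
formats <- data.frame(
  format = c("csv", "json", "yml", "fst"),
  pkg = c("data.table", "jsonlite", "yaml", "fst"),
  type = c("Imports", "Suggests", "Suggests", "Suggests"),
  stringsAsFactors = FALSE
)

uninstalled_packages <- function(x = formats) {
  suggested <- sort(unique(x$pkg[x$type == "Suggests"]))
  # find.package(quiet = TRUE) returns character(0) for a missing package.
  is_installed <- vapply(
    suggested,
    function(p) length(find.package(p, quiet = TRUE)) > 0,
    logical(1)
  )
  suggested[!is_installed]
}

uninstalled_packages()
```

This also removes the need for the hand-maintained `common_suggestions` exclusion list, since packages such as `knitr` never appear in the format data. As in the quoted function, version requirements from DESCRIPTION are not taken into account.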

@chainsawriot chainsawriot self-assigned this Sep 5, 2023
@chainsawriot

jsonlite::toJSON(data.frame(
extension = c("csv", "psv"),
format = c("Comma-separated data", "Pipe-separated data"),
import = c("data.table::fread", "data.table::fread"),
export = c("data.table::fwrite", "data.table::fwrite"),
type = c("Imports", "Imports"),
note = c("", "")
))
[
{"extension":"csv",
"format":"Comma-separated data",
"import":"data.table::fread",
"export":"data.table::fwrite",
"type":"Imports",
"note":""},

{"extension":"psv",
"format":"Pipe-separated data",
"import":"data.table::fread",
"export":"data.table::fwrite",
"type":"Imports",
"note":""}
]

We need four types: Imports (installed by default), Suggests (suggested packages, not installed by default), Enhances (e.g. ledger), and Known (known but unsupported, e.g. bib).
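
A small check like the following could guard the `type` field against typos when the data are edited by hand. The function name `validate_types()` and the example rows are hypothetical.

```r
# The four types proposed above.
allowed_types <- c("Imports", "Suggests", "Enhances", "Known")

# Hypothetical example rows, one per type.
formats <- data.frame(
  extension = c("csv", "json", "ledger", "bib"),
  type = c("Imports", "Suggests", "Enhances", "Known"),
  stringsAsFactors = FALSE
)

# Stop with an informative error if any row uses an unexpected type.
validate_types <- function(x) {
  bad <- setdiff(unique(x$type), allowed_types)
  if (length(bad) > 0) {
    stop("Unknown type(s): ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

validate_types(formats)
```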

@chainsawriot

rio/R/utils.R

Line 38 in 51e4421

"\t" = "tsv",

This alias is undocumented, and it is a pain to make it display correctly with knitr::kable.
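
One possible workaround, sketched here with base R only, is to escape control characters before the value reaches a table. `encodeString()` turns a literal tab into the two visible characters `\t`. Wrapping the result in backticks is an extra assumption about how the cell should be displayed.

```r
# Separator aliases from type_list, including the literal tab character.
inputs <- c(",", ";", "\t", "|")

# encodeString() escapes control characters the way print() shows them.
shown <- encodeString(inputs)

# Wrap in backticks so Markdown renders the values as code (an assumption).
cells <- sprintf("`%s`", shown)

nchar(inputs[3]) # the raw tab is a single character
nchar(shown[3])  # the escaped form is two characters: a backslash and "t"
```

The pipe character may still need separate handling inside a Markdown pipe table.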

chainsawriot added a commit that referenced this issue Sep 7, 2023
* Implement a single source of truth [no ci]

* Clean up convert.R [no ci]

* Update NEWS
@chainsawriot

chainsawriot commented Sep 7, 2023

TODO

  • Make get_type use the internal data
  • Make .export.default use the internal data
  • Make .import.default use the internal data
  • Make uninstalled_formats() use the internal data (rather than parsing DESCRIPTION)

Research

  • Is it possible to generate the Roxygen using the internal data too?

Doc

  • Write a short guide on how to update the internal data
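
For the first TODO item, a lookup against the internal data might look roughly like this. The data frame is a made-up excerpt with hypothetical column names, and the error raised for unknown input is an assumption rather than the package's actual behavior.

```r
# Hypothetical excerpt of the internal data: alias -> canonical format.
formats <- data.frame(
  input = c("csv", ",", "txt", "tsv", "excel", "xlsx", "bib"),
  format = c("csv", "csv", "tsv", "tsv", "xlsx", "xlsx", "bib"),
  stringsAsFactors = FALSE
)

# Resolve a user-supplied format or extension to its canonical name.
get_type <- function(fmt) {
  hit <- match(tolower(fmt), formats$input)
  if (is.na(hit)) {
    # Assumption: unknown input is an error in this sketch.
    stop("Unrecognized file format: ", fmt)
  }
  formats$format[hit]
}

get_type("TXT")
get_type(",")
```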

@chainsawriot

chainsawriot commented Sep 8, 2023

  • Remove ext in internal data (b/c it's confusing)

@chainsawriot

chainsawriot commented Sep 8, 2023

#123

  • gz and gzip (import) are not handled by tar; export is.

@chainsawriot chainsawriot changed the title Save the supported file formats table in a machine readable format and make it a single source of truth Generate the Roxygen using internal data Sep 8, 2023
@chainsawriot chainsawriot removed the v1.0 label Sep 8, 2023
@chainsawriot

All done, except the research point.
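
On the open research point, one direction that might work is roxygen2's `@eval` tag, which evaluates R code while the documentation is built and treats the returned character vector as roxygen text. The sketch below only builds the `\item` lines. The data frame, its column names, and `roxygen_items()` are hypothetical, and whether the output fits the existing documentation layout is untested.

```r
# Hypothetical excerpt of the internal data; column names are assumptions.
formats <- data.frame(
  format_name = c("Comma-separated data", "JSON"),
  extension = c(".csv", ".json"),
  export_fun = c("data.table::fwrite", "jsonlite::toJSON"),
  stringsAsFactors = FALSE
)

# Build the lines of an itemized list, one \item per supported format.
roxygen_items <- function(x) {
  items <- sprintf(
    "\\item %s (%s), using [%s()]",
    x$format_name, x$extension, x$export_fun
  )
  c("@details", "\\itemize{", items, "}")
}

writeLines(roxygen_items(formats))

# In the roxygen block of export(), this could then be referenced as:
# #' @eval roxygen_items(formats)
```

The per-format remarks in the current documentation (for example, the note on Stata column names) would need their own column, such as the `note` field proposed earlier in this thread.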
