Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rix: Reproducible Environments with Nix #625

Open
12 of 29 tasks
b-rodrigues opened this issue Feb 6, 2024 · 46 comments
Open
12 of 29 tasks

rix: Reproducible Environments with Nix #625

b-rodrigues opened this issue Feb 6, 2024 · 46 comments

Comments

@b-rodrigues
Copy link

b-rodrigues commented Feb 6, 2024

Submitting Author Name: Bruno Rodrigues
Submitting Author Github Handle: @b-rodrigues
Other Package Authors Github handles: @philipp-baumann
Repository: https://github.com/b-rodrigues/rix
Version submitted: 0.60
Submission type: Standard
Editor: @ldecicco-USGS
Reviewers: @wdwatkins, @assignUser

Archive: TBD
Version accepted: TBD
Language: en


  • Paste the full DESCRIPTION file inside a code block below:
Package: rix
Title: Rix: Reproducible Environments with Nix
Version: 0.6.0
Authors@R: c(
    person("Bruno", "Rodrigues", , "bruno@brodrigues.co", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0002-3211-3689")),
    person("Philipp", "Baumann", , "baumann-philipp@protonmail.com", role = "aut",
           comment = c(ORCID = "0000-0002-3194-8975"))
  )
Description: Provides helper functions to create reproducible development
    environments using the Nix package manager.
License: GPL (>= 3)
URL: https://b-rodrigues.github.io/rix/
BugReports: https://github.com/b-rodrigues/rix
Depends: 
    R (>= 2.10)
Imports: 
    codetools,
    httr,
    jsonlite,
    sys,
    utils
Suggests: 
    knitr,
    rmarkdown,
    testthat
VignetteBuilder: 
    knitr
Config/fusen/version: 0.5.2
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

The package makes it easy to define default.nix files which can then be use by the Nix package manager to build a reproducible development environment. This is especially useful to make workflows completely reproducible.

  • Who is the target audience and what are scientific applications of this package?

This package will be useful to anyone with very strict reproducibility requirements.

Not quite: to replace what {rix} provides one would need to combine Docker and {renv} or {groundhog}.

Not applicable.

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

#624

@jhollist @mpadge

  • Explain reasons for any pkgcheck items which your package is unable to pass.

We are unable to have the coverage of 75%. This is due to several issues:

  1. Some tests requires Nix to run. For these tests, we have a "skip if nix is not available" test.
  2. Some tests make {covr} completely fail, even if they have the skip_on_covr() function, but on only on GA. On our computers, covr::package_coverage() runs and these tests are successfully skipped.
  3. We have several snapshot tests, and these simply seem to be ignored by covr.

We have 26 tests, which all pass on different configurations: on Ubuntu, macOS and Windows, with or without Nix available (some tests get skipped if Nix is not available) and whether R itself is installed with Nix as well, or not (in other words, through the usual means for the operating system in use).

Because of these issues, our coverage is quite low:

> rix Coverage: 10.87%
R/get_latest.R: 0.00%
R/nix_build.R: 0.00%
R/rix.R: 0.00%
R/tar_nix_ga.R: 0.00%
R/with_nix.R: 0.68%
R/rix_init.R: 2.84%
R/detect_os.R: 28.57%
R/fetchgit.R: 90.00%
R/find_rev.R: 92.31%
R/available_r.R: 100.00%
R/detect_versions.R: 100.00%
R/get_sri_hash_deps.R: 100.00%

but if our snapshot tests (for the functions rix(), and tar_nix_ga()) would not be ignored, and if covr would successfully run the tests for with_nix(), our coverage would be more than 50%, we think. Some functions are very difficult to test, for example get_latest(), which gets the latest commit hash of the nixpkgs github repository.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

@ropensci-review-bot
Copy link
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Copy link
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Copy link
Collaborator

Checks for rix (v0.6.0)

git hash: e7b673cb

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✖️ Package coverage failed
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: GPL (>= 3)


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type package ncalls
internal base 199
internal rix 66
imports codetools 5
imports httr 4
imports utils 3
imports sys 2
imports jsonlite 1
suggests knitr NA
suggests rmarkdown NA
suggests testthat NA
linking_to NA NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

lapply (15), Sys.getenv (14), is.null (11), paste0 (9), Filter (8), file.path (7), list (7), names (7), for (6), get (6), grep (6), paste (6), seq_along (6), unlist (6), vapply (6), all (4), c (4), grepl (4), if (4), nzchar (4), args (3), character (3), emptyenv (3), switch (3), any (2), as.list (2), call (2), environmentName (2), getOption (2), gsub (2), is.function (2), new.env (2), setdiff (2), Sys.which (2), url (2), as.character (1), as.name (1), bquote (1), deparse (1), do.call (1), file (1), file.exists (1), formals (1), format (1), identical (1), length (1), logical (1), Map (1), match (1), match.call (1), normalizePath (1), readRDS (1), source (1), split (1), strsplit (1), Sys.info (1), Sys.time (1), system.file (1), tempdir (1), unname (1)

rix

classify_globals (3), deparse_chr1 (3), get_sri_hash_deps (3), detect_versions (2), find_rev (2), get_globals_exprs (2), get_rPackages (2), get_rprofile_text (2), is_nix_rsession (2), is_rstudio_session (2), nix_build_exit_msg (2), set_nix_path (2), where (2), available_r (1), check_expr (1), create_shell_nix (1), detect_os (1), fetchgit (1), fetchgits (1), fetchpkgs (1), fetchzip (1), fetchzips (1), fix_ld_library_path (1), generate_git_archived_packages (1), generate_header (1), generate_locale_archive (1), generate_locale_variables (1), generate_rix_call (1), generate_rpkgs (1), generate_rstudio_pkgs (1), generate_shell (1), generate_system_pkgs (1), generate_tex_pkgs (1), get_expr_extra_pkgs (1), get_latest (1), get_system_pkgs (1), is_empty (1), is_integerish (1), message_rprofile (1), nix_build (1), nix_build_installed (1), nix_rprofile (1), nix_shell_available (1), poll_sys_proc_nonblocking (1), quote_rnix (1), recurse_find_check_globals (1), serialize_globals (1), serialize_pkgs (1), with_assign_vec_call (1), with_assign_vecnames_call (1)

codetools

findGlobals (3), checkUsage (2)

httr

GET (3), content (1)

utils

timestamp (2), packageVersion (1)

sys

exec_status (2)

jsonlite

fromJSON (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 14 files) and
  • 2 authors
  • 12 vignettes
  • 1 internal data file
  • 5 imported packages
  • 6 exported functions (median 66 lines of code)
  • 104 non-exported functions in R (median 11 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 14 70.8
files_vignettes 12 99.6
files_tests 18 95.7
loc_R 1250 73.8
loc_vignettes 1689 96.2 TRUE
loc_tests 639 79.6
num_vignettes 12 99.9 TRUE
data_size_total 1318 61.2
data_size_median 1318 65.8
n_fns_r 110 78.8
n_fns_r_exported 6 29.1
n_fns_r_not_exported 104 84.5
n_fns_per_file_r 5 70.1
num_params_per_fn 4 67.3
loc_per_fn_r 12 36.1
loc_per_fn_r_exp 66 85.8
loc_per_fn_r_not_exp 11 35.4
rel_whitespace_R 20 76.4
rel_whitespace_vignettes 29 97.1 TRUE
rel_whitespace_tests 21 78.5
doclines_per_fn_exp 66 77.6
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 79 75.2

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

R-CMD-check.yaml
pkgcheck

GitHub Workflow Results

id name conclusion sha run_number date
7802080928 devtools-tests-via-r-nix success e7b673 69 2024-02-06
7802080935 nix-builder success e7b673 385 2024-02-06
7802119374 pages build and deployment success ef12c6 276 2024-02-06
7802080938 pkgdown success e7b673 532 2024-02-06
7802080934 R-CMD-check success e7b673 528 2024-02-06
7801340546 ropensci-pkgcheck failure 27a828 64 2024-02-06
7802080931 tests-r-via-system success e7b673 66 2024-02-06

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following note:

  1. checking package subdirectories ... NOTE
    Problems with news in ‘NEWS.md’:
    Cannot extract version info from the following section titles:
    rix (development version)

Test coverage with covr

ERROR: Test Coverage Failed

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function cyclocomplexity
with_nix 27
rix 22

Static code analyses with lintr

lintr found the following 108 potential issues:

message number of times
Avoid library() and require() calls in packages 19
Lines should not be more than 80 characters. 87
unexpected '/' 1
Use <-, not =, for assignment. 1


Package Versions

package version
pkgstats 0.1.3.9
pkgcheck 0.1.2.14


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@ldecicco-USGS
Copy link

@ropensci-review-bot assign @ldecicco-USGS as editor

@ropensci-review-bot
Copy link
Collaborator

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@ropensci-review-bot help

@ldecicco-USGS
Copy link

@ropensci-review-bot assign @ldecicco-USGS as editor

@ropensci-review-bot
Copy link
Collaborator

Assigned! @ldecicco-USGS is now the editor

@ldecicco-USGS
Copy link

Editor checks:

  • Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package. In particular,
    • Is the case for the package well made?
    • Is the reference index page clear (grouped by topic if necessary)?
    • Are vignettes readable, sufficiently detailed and not just perfunctory?
  • Fit: The package meets criteria for fit and overlap.
  • Installation instructions: Are installation instructions clear enough for human users?
  • Tests: If the package has some interactivity / HTTP / plot production etc. are the tests using state-of-the-art tooling?
  • Contributing information: Is the documentation for contribution clear enough e.g. tokens for tests, playgrounds?
  • License: The package has a CRAN or OSI accepted license.
  • Project management: Are the issue and PR trackers in a good shape, e.g. are there outstanding bugs, is it clear when feature requests are meant to be tackled?

Editor comments

Looks good! I'll begin searching for editors.

@ldecicco-USGS
Copy link

@ropensci-review-bot seeking reviewers

@ropensci-review-bot
Copy link
Collaborator

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/625_status.svg)](https://github.com/ropensci/software-review/issues/625)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

@b-rodrigues
Copy link
Author

b-rodrigues commented Feb 12, 2024

Please add this badge to the README of your package repository:

done with: b-rodrigues/rix@859fd18

@ldecicco-USGS
Copy link

@ropensci-review-bot assign @wdwatkins as reviewer

@ropensci-review-bot
Copy link
Collaborator

@wdwatkins added to the reviewers list. Review due date is 2024-03-06. Thanks @wdwatkins for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

@ropensci-review-bot
Copy link
Collaborator

@wdwatkins: If you haven't done so, please fill this form for us to update our reviewers records.

@ldecicco-USGS
Copy link

@ropensci-review-bot assign @assignUser as reviewer

@ropensci-review-bot
Copy link
Collaborator

@assignUser added to the reviewers list. Review due date is 2024-03-11. Thanks @assignUser for accepting to review! Please refer to our reviewer guide.

rOpenSci’s community is our best asset. We aim for reviews to be open, non-adversarial, and focused on improving software quality. Be respectful and kind! See our reviewers guide and code of conduct for more.

@ropensci-review-bot
Copy link
Collaborator

@assignUser: If you haven't done so, please fill this form for us to update our reviewers records.

@ropensci-review-bot
Copy link
Collaborator

📆 @wdwatkins you have 2 days left before the due date for your review (2024-03-06).

@wdwatkins
Copy link

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing:6

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Due to my organization's IT policies, I can only run nix in a Docker container,so all my experimenting with the actual nix libraries is done there. I can use the CLI to interact with nix,though its a bit awkward without other things installed (e.g. R). I was able to generate the default.nix files with rix, then copy them into the container to build them with shell commands.

You may need to update the nix install command at some point, just be aware if that changes

Good selection of vignettes, help files are complete.

I also found your "What is nix" blog post particularly helpful for understanding what was actually happening under the hood. It seems likely to me that a good portion of your users will also be using nix for the first time, so it is worth making sure you explain or link out to explanations of how nix itself works at an intuitive level, in the interest of usability.

From goodpractice results

Is there a reason for using = here R/with_nix.R:564:13?

code comments

I see a couple 'list versions' functions rely on a stored data file. Is there a way to get this information directly from the nix archive so you don't need to keep that up to date?

There are a fair number of internal functions that don't have inputs/outputs documented, I would suggest documenting those for your own/future maintainer's benefit down the road.

Also, several todos still in comments.

test comments

Seems like get_latest.R could be tested at least that it doesn't error and returns a hash,
outside of environments where web service requests would fail?

Re: coverage—just to make sure I understand, the 25% coverage value is ignoring the tests
that actually run nix commands since you would need an environment with nix
installed to run them, which you don't get in CRAN or other non-containerized environments?
It seems like it should be possible to make a Docker container with both R and nix
installed to get a more realistic test environment, but maybe you've done this already.

@ropensci-review-bot
Copy link
Collaborator

📆 @assignUser you have 2 days left before the due date for your review (2024-03-11).

@assignUser
Copy link

assignUser commented Mar 11, 2024

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.

Estimated hours spent reviewing: 6

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

{rix} is a very interesting package that will allow R developers to make use of nix powerful features without having to learn (much) of its arcane secrets 🧙 . For me personally it was a great opportunity to finally get to know 'nix' (special thanks for pointing to the DS installer as I have previously destroyed a shell env via half-removed nix 😁).

While it is afaik not required by rOpenSci, I would recommend to change the default branch from 'master' to 'main'. It's a small, easy change that is well supported by GitHub but is quite meaningful.

Documentation

Overall the documentation is very good and covers basic as well as advanced features of the package and points to a plethora of external resources about nix. The README especially guides both beginner and advanced users and points to the specific useful further materials. I have made a few notes while reading/following along with the vignettes that I will list as bullet points for brevity. These are all what I would consider 'nits'/nice to haves and not show stoppers.

  • All vignettes start with a code block with a library(rix) call, which seems uneccessary (-> echo=False).
  • The 'numbering' of the vignettes being 'hardcoded' in the title is a bit unusual and I personally would remove it as the vignettes all link to their respective follow up vignettes (well except C 'Using rix'). If this is done to make sure they appear in a certain order in the pkgdown menu: this can be controlled via the _pkgdown.yml.
  • Getting Started: more paragraphs would make for a smoother read of the first block of text.
  • b2: 'generates R expressions' -> 'nix expressions'?
  • d2: Should the link to the nix package search by default be to the stable rather than unstable channel?
  • e: I tried using nvim with lsp in an nix shell and that crashed but I think that might be due to my use of mason, which provides a separate R env for the languageserver, clashing with nix. I haven't done any further investigations for now.
  • d1: It's imo superflous to require both branch and commit as a commit is unique to a repo and not a branch so the additional info of branch is not necessary. Branch could be resolved to the lates commit on that branch (not entireliy inline with literate programming but useful when working against a quickly changing internal pkg?). It would also be nice to add a short hand similar to {pak} ala owner/repo@ref (ref = tag, branch or commit)
  • Remote dependencies: I think {rix} should provide functionality to install packages with remote dependencies not found in nixpkgs. While it might be relatively rare case for packages publicly hosted on GitHub I think it is much less unusual for companies to have a number of internal packages that also depend on each other. The issue with no default way to know which commit could be solved by printing a warning and maybe the required shorthand to explicitly add the pkg to rix().
  • The developers vignettes are both quite short and could be combined into one.

Code

  • I agree with @wdwatkins regarding the test coverage.

  • There is a probably unintentional partial match in flag_git_archive: In cran_pkgs$archive : partial match of 'archive' to 'archive_pkgs'

  • I like the use of {fledge} but personally don't think {fusen} is the right choice for a technically advanced package like {rix}. It is of course preference and up to you but I think {fusen} changes the established workflow of an R package dev quite a bit. I see how it can lower the bar for people to get started with creating packages and that's great. But I think in this case the people that are likely going to contribute (experienced devs) might not do it if that means they have to spend time getting used to {fusen} vs. just adding a quick patch. (I certainly looked through R/ and not dev/)

  • The two main functions rix and with_rix and some of the helper functions are quite long and would benefit from some refactoring (a comment hinting which if() your closing is a good pointer that some refactoring is in order 😉 ). This would also remove duplicate code (e.g. in serialize_globals)

  • The large error message in with_rix will never be reached due to the preceding stopifnot, maybe combine the two?

  • I was surprised that you are not using {rlang} and quosures, it has been a while since I have used them but I would think they would be a good fit here. (but maybe not, it's been a while since I have used them 'under the hood')

  • GHAs are my specialty so some extended notes on extdata/run-pipeline.yml (most of this applies in general as well):

    • Please explicitly add the contents: write permission. This is technically not required as the token has the correct permisson but it is best practice to minimise token permissions.
      • You should also do this on the workflows in .github/workflows/ with the minimal permissions each (e.g. contents: read)
    • Always use at least tags for actions for stability, in an action that has write permission I would even highly recommend to explicitly use a commit hash. That ensures that even a malicous take over of one of the action repos does not allow a third-party to run code in your repo. Github's dependabot can handle updating shas and tags.
    • Make sure tags/hashes are updated regularly (actions/checkout is at v4 but v2 is used in the workflow).
    • secrets should always be assigned to envvars as close to where they are used as possible (usually step env:) and not for the entire workflow.
    • I would recommend to use an explicit output instead of continue-on-error as that still creates a big red error message in the job summary.
  • It seems like the codetools::check is run quite a lot, I assume it's to make debugging easier?

  • with_nix is very verbose with ~ a terminal length of 'preamble' output of checks etc., I think a 'quite' option would be nice when running it repeatedly.

  • 'hidden' bindings like {arrow}'s dplyr bindings are not forwarded to the nix shell with_nix (the workaround of explicitly calling some function in the expression you want to run is probably way easier than an actual fix, but I don't really know as I am mostly concerned with arrow's build system^^):

    library(rix)
    library(arrow)
    library(dplyr)
    arrow_cars <- arrow_table(cars)
    arrow_test <- function() {
      arrow_cars %>% 
        dplyr::filter(speed > 10)
    }
    with_nix(arrow_test())

    => Serializing package(s) required to run expr:
    dplyr

@b-rodrigues
Copy link
Author

Thank you very much for your reviews!

Until when do we have to address them?

@ldecicco-USGS
Copy link

Thanks so much @wdwatkins and @assignUser ! There's no specific due date for a response. The sooner you respond, the fresher the comments are in the reviewers' mind.

@ldecicco-USGS
Copy link

@ropensci-review-bot submit review #625 (comment) time 6

@ropensci-review-bot
Copy link
Collaborator

Logged review for assignUser (hours: 6)

@ldecicco-USGS
Copy link

@ropensci-review-bot submit review #625 (comment) time 6

@ropensci-review-bot
Copy link
Collaborator

Logged review for wdwatkins (hours: 6)

@philipp-baumann
Copy link

Thanks a lot @wdwatkins and @assignUser also from my side for the encouraging review feedback and concrete tips to improve {rix}, and @ldecicco-USGS for guiding the review process.

@ropensci-review-bot
Copy link
Collaborator

@b-rodrigues, @philipp-baumann: please post your response with @ropensci-review-bot submit response <url to issue comment> if you haven't done so already (this is an automatic reminder).

Here's the author guide for response. https://devguide.ropensci.org/authors-guide.html

@b-rodrigues
Copy link
Author

Sorry we've been slow to respond, but you'll hear from us this week!

@b-rodrigues
Copy link
Author

b-rodrigues commented Mar 28, 2024

I also found your "What is nix" blog post particularly helpful for understanding what was actually happening under the hood. It seems likely to me that a good portion of your users will also be using nix for the first time, so it is worth making sure you explain or link out to explanations of how nix itself works at an intuitive level, in the interest of usability.

I’ve added a new section to the "Getting started vignette" that includes the introductory text of the blog post: b-rodrigues/rix@7542610

From goodpractice results

Is there a reason for using = here R/with_nix.R:564:13?

No, fixed with: b-rodrigues/rix@8ed3a4a

code comments

I see a couple 'list versions' functions rely on a stored data file. Is there a way to get this information directly from the nix archive so you don't need to keep that up to date?

It would always be the same data, so we thought it might be better to provide that information with the package as a dataset. Also, it would work without an active internet connection.

There are a fair number of internal functions that don't have inputs/outputs documented, I would suggest documenting those for your own/future maintainer's benefit down the road.

Indeed, this will be done in the coming days.

Also, several todos still in comments.

These todos are more for future references of functionalities we want to add, but agreed, this should be moved to specific issues. Will address this in the coming days as well.

test comments

Seems like get_latest.R could be tested at least that it doesn't error and returns a hash, outside of environments where web service requests would fail?

Test added with: b-rodrigues/rix@14632fe
EDIT: It wasn't working but now it is: b-rodrigues/rix@30871d1

Re: coverage—just to make sure I understand, the 25% coverage value is ignoring the tests that actually run nix commands since you would need an environment with nix installed to run them, which you don't get in CRAN or other non-containerized environments? It seems like it should be possible to make a Docker container with both R and nix installed to get a more realistic test environment, but maybe you've done this already.

Actually, the issue is more complex: we do have our tests running on Github Actions with nix installed (and also without, to mimick environments with and without nix), but for some complex reason, {covr} doesn’t run the tests that require Nix, these end up failing. So can’t compute the coverage.

We’ve collected the commits in this pull request: b-rodrigues/rix#153 and we’ll likely add more commits to address @assignUser points as well

@b-rodrigues
Copy link
Author

@assignUser

regarding your suggestions for GHAs:

GHAs are my specialty so some extended notes on extdata/run-pipeline.yml (most of this applies in general as well):

Please explicitly add the contents: write permission. This is technically not required as the token has the correct permisson but it is best practice to minimise token permissions.
You should also do this on the workflows in .github/workflows/ with the minimal permissions each (e.g. contents: read)
Always use at least tags for actions for stability, in an action that has write permission I would even highly recommend to explicitly use a commit hash. That ensures that even a malicous take over of one of the action repos does not allow a third-party to run code in your repo. Github's dependabot can handle updating shas and tags.
Make sure tags/hashes are updated regularly (actions/checkout is at v4 but v2 is used in the workflow).
secrets should always be assigned to envvars as close to where they are used as possible (usually step env:) and not for the entire workflow.
I would recommend to use an explicit output instead of continue-on-error as that still creates a big red error message in the job summary.

Do you happen to perhaps have a couple examples on hand I could take inspiration from? I’m struggling a bit

@b-rodrigues
Copy link
Author

b-rodrigues commented Mar 28, 2024

Code

  • I agree with @wdwatkins regarding the test coverage.

Same answer :)

  • There is a probably unintentional partial match in flag_git_archive: In cran_pkgs$archive : partial match of 'archive' to 'archive_pkgs'

thanks, solved with b-rodrigues/rix@922a2c0

  • I like the use of {fledge} but personally don't think {fusen} is the right choice for a technically advanced package like {rix}. It is of course preference and up to you but I think {fusen} changes the established workflow of an R package dev quite a bit. I see how it can lower the bar for people to get started with creating packages and that's great. But I think in this case the people that are likely going to contribute (experienced devs) might not do it if that means they have to spend time getting used to {fusen} vs. just adding a quick patch. (I certainly looked through R/ and not dev/)

You may be right, but I think that the trade-off is worth it; fusen takes care of many annoying aspects of package development. But to meet devs not familiar with fusen halfway, I’ve changed contributing.md to state that it’s ok that PRs change files in R/ directly: b-rodrigues/rix@1264b7c

  • The large error message in with_rix will never be reached due to the preceding stopifnot, maybe combine the two?

thanks, solved with b-rodrigues/rix@854a202

The 'numbering' of the vignettes being 'hardcoded' in the title is a bit unusual and I personally would remove it as the vignettes all link to their respective follow up vignettes (well except C 'Using rix'). If this is done to make sure they appear in a certain order in the pkgdown menu: this can be controlled via the _pkgdown.yml.

The idea was for users to very quickly see and understand that we expect them to read the documentation in a certain order. At first I tried using numbers, but numbers can’t be the first character of a vignette’s name. I think that if we remove them, it’ll be harder for new users to follow.

d1: It's imo superflous to require both branch and commit as a commit is unique to a repo and not a branch so the additional info of branch is not necessary. Branch could be resolved to the lates commit on that branch (not entireliy inline with literate programming but useful when working against a quickly changing internal pkg?). It would also be nice to add a short hand similar to {pak} ala owner/repo@ref (ref = tag, branch or commit)

This is required because the underlying Nix function fetchgit requires both the repo and the commit. It might be more cumbersome from a UX perspective, but I want to force users to spell out what they want to do each time, to avoid issues down the line with reproducibility.

I’m not sure I understand this point:

  • 'hidden' bindings like {arrow}'s dplyr bindings are not forwarded to the nix shell with_nix (the workaround of explicitly calling some function in the expression you want to run is probably way easier than an actual fix, but I don't really know as I am mostly concerned with arrow's build system^^):

    library(rix)
    library(arrow)
    library(dplyr)
    arrow_cars <- arrow_table(cars)
    arrow_test <- function() {
      arrow_cars %>% 
        dplyr::filter(speed > 10)
    }
    with_nix(arrow_test())

    => Serializing package(s) required to run expr:
    dplyr

Here is a gist with a working example: https://gist.github.com/b-rodrigues/8c4e4cc1e225c5ba7d7780066120fe0a
does that answer your question?

We’ll address the other comments later this week

@b-rodrigues
Copy link
Author

  • All vignettes start with a code block with a library(rix) call, which seems uneccessary (-> echo=False).

{fusen} does that automatically, and after contacting the author, there is no way (for now, at least) to easily set "echo" to FALSE.

d2: Should the link to the nix package search by default be to the stable rather than unstable channel?

We use "unstable" throughout the package, because this is where packages are the most up-to-date. "unstable" is a bit of a misnomer though, what this means is simply that these packages aren't yet part of a point-release. But they're reviewed before being merged to unstable, so unstable provides quite stable packages (there are some exceptions of course, as nothing is perfect :) )

Remote dependencies: I think {rix} should provide functionality to install packages with remote dependencies not found in nixpkgs. While it might be relatively rare case for packages publicly hosted on GitHub I think it is much less unusual for companies to have a number of internal packages that also depend on each other. The issue with no default way to know which commit could be solved by printing a warning and maybe the required shorthand to explicitly add the pkg to rix().

I'll have to think about this some more. My take is that I would like to avoid encouraging behaviours that could lead to generating development environments that can't be reproduced anymore. If the commit is set to the latest one automatically each time the function runs, then that introduces a side-effect (on top of generating a file) that I think could lead to more problems than solutions.

@b-rodrigues
Copy link
Author

with_nix is very verbose with ~ a terminal length of 'preamble' output of checks etc., I think a 'quite' option would be nice when running it repeatedly.

What would be the best way to do so? Is there some way that we could add this without needing to touch the overall structure of the function?

@philipp-baumann
Copy link

philipp-baumann commented Apr 7, 2024

@assignUser

* I was surprised that you are not using {rlang} and quosures, it has been a while since I have used them but I would think they would be a good fit here. (but maybe not, it's been a while since I have used them 'under the hood')

When I started with_nix()I thought about it, but then decided against it. Base R has very good computing on the language foundations, which suffices for the functionality we want to provide in with_nix(). For a package that facilitates reproducibility -- although nix provides all the ways to do so -- I prefer to reduce the amount of dependencies whenever possible (we for sure also have room to improve this in current {rix} ;-)).

@philipp-baumann
Copy link

@assignUser

* It seems like the `codetools::check` is run quite a lot, I assume it's to make debugging easier?

Exactly, it's another way to get feedback on possible scoping problems of variables etc. There is minor things we can deal with and runs in code exported to the nix session: for example, we support when an expr function or the ones downstream the call stack contain global variables . Here, The codetools checkUsage() also informs but the code runs because we also export the object in the global environment of the current R session invoking with_nix(). with_nix() is able to detect and export complex call stacks, because it recursively calls globals found by findGlobals(). Along this, we also repeat checkUsage(), which "checks a single R closure".

@philipp-baumann
Copy link

@b-rodrigues @assignUser

  • I like the use of {fledge} but personally don't think {fusen} is the right choice for a technically advanced package like {rix}. It is of course preference and up to you but I think {fusen} changes the established workflow of an R package dev quite a bit. I see how it can lower the bar for people to get started with creating packages and that's great. But I think in this case the people that are likely going to contribute (experienced devs) might not do it if that means they have to spend time getting used to {fusen} vs. just adding a quick patch. (I certainly looked through R/ and not dev/)

You may be right, but I think that the trade-off is worth it; fusen takes care of many annoying aspects of package development. But to meet devs not familiar with fusen halfway, I’ve changed contributing.md to state that it’s ok that PRs change files in R/ directly: b-rodrigues/rix@1264b7c

I see there are advantages for {fusen} for smaller packages, but as things grow and we mainly want to take away complexity, I agree with @assignUser and am in favor of completely removing it in the workflow. There is fusen::inflate_all_no_check() which I mostly use use to safe time, but when developing with_nix() it meant mostly more overhead to wait when doing browser() etc. Adding it would also mean that we force newcomers and possible contributers to to use a nix development environment. I have experienced different transformation behaviors, and I believe there are also formatting changes that will come since fusen hasn't completely stabilized.

@assignUser
Copy link

What would be the best way to do so? Is there some way that we could add this without needing to touch the overall structure of the function?

Use a wrapper function around cat with a quiet or something arg that skips printing and add that to with_nix too and pass it through as the default arg. You can give that to codetools::check_usage via report too :)
(or something like that, I just had a quick glimpse at the code)

@assignUser
Copy link

If the commit is set to the latest one automatically each time the function runs, then that introduces a side-effect (on top of generating a file) that I think could lead to more problems than solutions.

I think I disagree but am not super firm on it, if this is your choice you could output the correct shorthand with a fixed sha in an error instead of a warning. That way the user experience is still much better than manually editing generated files (which just seems so counter to the mission of {rix}) while still enforcing 'nix'ness' :D

@assignUser
Copy link

@b-rodrigues for better notification :)

This is required because the underlying Nix function fetchgit requires both the repo and the commit. It might be more cumbersome from a UX perspective,

Ok in that case leave branch out as it's redundant or allow adding repo + branch only and fetch the HEAD of the branch for the user.

Looking at master I still see the stopifnot that you removed in the linked commit above?

With regards to GHA examples you can take a look in https://github.com/apache/arrow/tree/main/.github/workflows we have lot's of workflows that do a lot of things ^^ for more piece meal examples I'd recommend the docs. You can also ping me on a PR with specific questions if you are still stuck :)

@llrs
Copy link
Member

llrs commented May 6, 2024

@ldecicco-USGS I was asked to provide some feedback. I do have some comments (see reply in mastodon) but there might be more if I look into the code.

I am not sure if it is best to post it during the review or once it ends.
In any case, please do not wait for my feedback and let me know what would you (and all the involved people) prefer.

@ldecicco-USGS
Copy link

Hi @llrs ! Thanks for the offer. We do have 2 reviewers' already, I think creating an issue on the nix repository would probably be the best plan.

@philipp-baumann and @b-rodrigues are you at a point where I should be verifying that the reviewers are satisfied with your response to the reviews? It looks like there's still some back and forth happening (?), but mostly ready?

@b-rodrigues
Copy link
Author

Hi

we still need some time to work out some of the input from the reviewers, but I think we should be done by the end of the month at the latest. Thanks for your patience!

@b-rodrigues
Copy link
Author

@b-rodrigues for better notification :)

This is required because the underlying Nix function fetchgit requires both the repo and the commit. It might be more cumbersome from a UX perspective,

Ok in that case leave branch out as it's redundant or allow adding repo + branch only and fetch the HEAD of the branch for the user.

Looking at master I still see the stopifnot that you removed in the linked commit above?

With regards to GHA examples you can take a look in https://github.com/apache/arrow/tree/main/.github/workflows we have lot's of workflows that do a lot of things ^^ for more piece meal examples I'd recommend the docs. You can also ping me on a PR with specific questions if you are still stuck :)

So I've tried with the following commits:

b-rodrigues/rix@82890b3

b-rodrigues/rix@3b8335f

b-rodrigues/rix@fb4a8f2

b-rodrigues/rix@e3966b0

b-rodrigues/rix@a93771b

is that ok like this?

@assignUser
Copy link

is that ok like this?

Yes, that looks good 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants