Implement the archive #122

Open
andrie opened this issue Jun 29, 2019 · 7 comments

Comments

@andrie
Owner

andrie commented Jun 29, 2019

When a new version of a package is added to the repo, move the old version to the archive.

@achubaty
Collaborator

Based on CRAN, the directory structure of the archive should look like /src/contrib/Archive/pkgName/pkgName_x.y.z.tar.gz. Each package gets its own subdirectory in Archive, and that subdirectory holds the source tarballs of previous versions.
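
As a rough sketch of that layout, the helper below builds the relative path of one archived tarball. The function name archive_path and the version number are hypothetical and used only for illustration; miniCRAN does not provide such a function.

```r
# Hypothetical helper: relative path of an archived source tarball,
# following the CRAN-style layout described above.
archive_path <- function(pkg, version, contrib = file.path("src", "contrib")) {
  file.path(contrib, "Archive", pkg, sprintf("%s_%s.tar.gz", pkg, version))
}

archive_path("pkgName", "1.0.2")
```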

Implementation considerations:

  1. When adding a package to the miniCRAN repo, should any/all old versions be added?
  2. addOldPackage currently installs an old version of a downloaded package as the "current" version in the miniCRAN repo. Once an archive is implemented, this no longer seems like the correct behaviour.

There are probably other points that need to be resolved before going ahead with implementation.

@stefanoborini

stefanoborini commented Dec 17, 2019

One point worth noting is that it all boils down to how devtools actually looks for archived information. I checked the source code and apparently it's not only about creating Archive/package/package_version.tar.gz. There's also a directory "Meta" that contains the following files:

aliases.rds
archive.rds
current.rds
rdxrefs.rds

To be fair, I am using a very old version of devtools, but the problem seems to be here:

https://rdrr.io/cran/remotes/src/R/install-version.R

package_find_repo <- function(package, repos) {
  for (repo in repos) {
    if (length(repos) > 1)
      message("Trying ", repo)

    archive <-
      tryCatch({
        con <- gzcon(url(sprintf("%s/src/contrib/Meta/archive.rds", repo), "rb"))
        on.exit(close(con))
        readRDS(con)
      },
      warning = function(e) list(),
      error = function(e) list())

    info <- archive[[package]]
    if (!is.null(info)) {
      info$repo <- repo
      return(info)
    }
  }

  stop(sprintf("couldn't find package '%s'", package))
}

So what devtools is doing (at least the version I currently have, 1.13.6, which is ancient) is to look for archive.rds and use it as a source of metainfo. Then, the calling routine (devtools::install_version) does the Archive/ dance to retrieve the package.
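
To make that two-step lookup concrete, here is a minimal sketch, assuming archive.rds holds a named list with one data frame per package whose row names are paths relative to Archive/. The helper archived_url, the in-memory archive object, and the example.org repo URL are all made up for illustration and are not part of devtools or remotes.

```r
# Hypothetical stand-in for the list read from Meta/archive.rds:
# one data frame per package, row names are paths relative to Archive/.
archive <- list(
  pkgName = data.frame(
    size = c(1024, 2048),
    row.names = c("pkgName/pkgName_0.1.0.tar.gz", "pkgName/pkgName_0.2.0.tar.gz")
  )
)

# Assumed helper (not from remotes): build the download URL for one
# archived version, failing if the index does not list it.
archived_url <- function(archive, package, version, repo) {
  info <- archive[[package]]
  if (is.null(info)) {
    stop(sprintf("couldn't find package '%s'", package))
  }
  path <- sprintf("%s/%s_%s.tar.gz", package, package, version)
  if (!path %in% rownames(info)) {
    stop(sprintf("version '%s' is not in the archive for '%s'", version, package))
  }
  paste0(repo, "/src/contrib/Archive/", path)
}

archived_url(archive, "pkgName", "0.1.0", "https://example.org/minicran")
```

The point of the sketch is that the metadata file is consulted first and the Archive/ directory is only touched once the version is known to exist.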

@achubaty
Collaborator

Thanks for that @stefanoborini. I'll also add that the CRAN servers appear to be using Archive as a symlink(?) to 00Archive:

[image not preserved]

@stefanoborini

I guess it's just a trick to make it the first entry so it's easier to find.

@achubaty
Collaborator

It makes sense to grab the metadata file to avoid recursing through the directories to find packages. So that will be an additional detail to consider.

Yes, 00Archive is likely done to make it the first directory in /src/contrib/ so humans can find it easily.

@stefanoborini

Note that the current version of devtools has moved the above code to the package remotes. The code is unchanged.

@meztez

meztez commented Dec 15, 2022

This is something we do after running miniCRAN::addLocalPackage(basename(pkg), "..", "/var/www/minicran"):

          contrib <- "/var/www/minicran/src/contrib"
          archive_dir <- file.path(contrib, "Archive")
          # Tarballs not listed in PACKAGES.rds are superseded versions
          pkgs <- readRDS(file.path(contrib, "PACKAGES.rds"))
          pkgs <- paste0(pkgs[, 1], "_", pkgs[, 2], ".tar.gz")
          files <- dir(path = contrib, pattern = "\\.tar\\.gz$")
          archives <- files[!files %in% pkgs]
          dir.create(archive_dir, showWarnings = FALSE, recursive = TRUE)
          # Meta sits next to Archive (not inside it), matching the path saveRDS() writes to below
          dir.create(file.path(contrib, "Meta"), showWarnings = FALSE)
          # file.path() returns character(0) when there is nothing to move
          file.rename(file.path(contrib, archives), file.path(archive_dir, archives))
          # Move archives in individual folders so renv can restore older versions
          f <- list.files(archive_dir, pattern = "\\.tar\\.gz$")
          dir <- sort(unique(vapply(strsplit(f, "_"), `[`, character(1), 1)))
          for (d in dir) {
            dir.create(file.path(archive_dir, d), showWarnings = FALSE)
            # startsWith() avoids treating "." in package names as a regex wildcard
            files <- f[startsWith(f, paste0(d, "_"))]
            file.rename(file.path(archive_dir, files), file.path(archive_dir, d, files))
          }
          # Create or update archive.rds; paths are stored relative to Archive/
          wd <- getwd()
          setwd(archive_dir)
          archive <- lapply(setNames(nm = list.files(".")), function(x) {
            file.info(list.files(x, recursive = TRUE, full.names = TRUE))
          })
          saveRDS(archive, "../Meta/archive.rds")
          setwd(wd)
In case it helps someone dealing with an automated process using minicran.
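
For anyone who wants to try the indexing step without touching a real repo, the following is a small self-contained sketch that runs entirely in a temporary directory. The package name, version, and directory names are made up, and the empty file merely stands in for a real tarball.

```r
# Build a throwaway repo layout in a temporary directory (all names are made up).
contrib <- file.path(tempfile("minicran"), "src", "contrib")
dir.create(file.path(contrib, "Archive", "pkgName"), recursive = TRUE)
dir.create(file.path(contrib, "Meta"))
file.create(file.path(contrib, "Archive", "pkgName", "pkgName_0.1.0.tar.gz"))

# Index the archive with paths relative to Archive/, as in the script above.
wd <- setwd(file.path(contrib, "Archive"))
archive <- lapply(setNames(nm = list.files(".")), function(x) {
  file.info(list.files(x, recursive = TRUE, full.names = TRUE))
})
saveRDS(archive, "../Meta/archive.rds")
setwd(wd)

# Read the index back and inspect the entries recorded for one package.
index <- readRDS(file.path(contrib, "Meta", "archive.rds"))
rownames(index[["pkgName"]])
```

If the row names come back as pkgName/pkgName_0.1.0.tar.gz, the index uses the same relative paths as the directory layout under Archive/.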
