Idea: serial transformation representation #18

vreuter · 2017-09-02T03:03:49Z

Maybe sort of pie-in-the-sky, but I think it'd be cool to be able to provide a function that takes a list of instructions and interprets it as a sequential series of transformations from one cache to another, like Spark does with RDDs. The function could work backward from the end of the list, determining the filepath for the cache to be created or loaded and checking whether it exists. recreate would be used as-is but would apply only to the "leaf" cache, and list elements would be just instructions, not entire bundles of simpleCache arguments. The function would backtrack until it hit an existing cache, then sequentially execute the instructions from there, generating the intermediate cache(s). I've found myself wanting to reuse caches between scripts, which forces a tradeoff: either duplicate the code used to create the cache, or invoke loadCaches and lose the create-if-needed benefit of simpleCache.
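To make the backtracking concrete, here is a minimal base-R sketch of what I have in mind. Everything in it is an assumption for illustration: runCacheChain is a hypothetical name, not a simpleCache function, each step is a plain function of the previous result rather than a bracketed instruction, and caches are stored as .rds files named after the list elements instead of going through simpleCache's own storage.

```r
# Hypothetical helper, NOT part of simpleCache: run a chain of cached steps.
# `steps` is a named list of functions; the names double as cache names, and
# each function receives the previous step's result (NULL for the first step).
runCacheChain <- function(steps, cacheDir, recreate = FALSE) {
  stopifnot(length(steps) > 0, !is.null(names(steps)), all(names(steps) != ""))
  paths <- file.path(cacheDir, paste0(names(steps), ".rds"))
  n <- length(steps)

  # Work backward from the leaf to find the latest cache already on disk.
  # With recreate = TRUE the leaf is skipped, so it is always rebuilt.
  start <- 0L
  candidates <- if (recreate) rev(seq_len(n - 1L)) else rev(seq_len(n))
  for (i in candidates) {
    if (file.exists(paths[i])) {
      start <- i
      break
    }
  }

  # Load that cache (or start from nothing), then replay the remaining steps,
  # writing each intermediate cache along the way.
  result <- if (start > 0L) readRDS(paths[start]) else NULL
  for (i in seq_len(n)[seq_len(n) > start]) {
    result <- steps[[i]](result)
    saveRDS(result, paths[i])
  }
  result
}

# Example usage with a throwaway cache directory.
cacheDir <- tempfile("caches")
dir.create(cacheDir)
steps <- list(
  raw     = function(prev) 1:10,
  squared = function(prev) prev^2,
  total   = function(prev) sum(prev)
)
total <- runCacheChain(steps, cacheDir)                   # builds all three caches
total <- runCacheChain(steps, cacheDir, recreate = TRUE)  # reloads squared, rebuilds total
```

A second script could call the same chain and would only pay for the steps whose caches are missing, which is the create-if-needed behavior that loadCaches alone gives up.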


nsheff · 2017-09-02T11:35:49Z

It's not exactly the same thing, but did you see the buildDir option?

simpleCache/R/simpleCache.R

Lines 26 to 33 in 8ff4d09

    #' You should pass a bracketed R code snippet like `{ rnorm(500) }` as the
    #' instruction, and simpleCache will create the object. Alternatively, if the
    #' code to create the cache is large, you can put an R script called object.R in
    #' the RBUILD.DIR (the name of the file *must* match the name of the object it
    #' creates *exactly*). If you don't provide an instruction, the function sources
    #' RBUILD.DIR/object.R and caches the result as the object. This source file
    #' *must* create an object with the same name of the object. If you already have

But practically, what I do in this situation is order the scripts and then just provide the instruction in the main script. I think this is simpler than using the buildDir, which I don't use anymore. Then I just use loadCaches in later scripts.

So for some things I have cache-generating scripts and then cache-using scripts, and that seems to work OK.

I do think the order of transformations on a cache is interesting, but it seems like a different issue from avoiding duplicated cache creation code.

vreuter mentioned this issue Sep 2, 2017

Cache function script databio/projectInit#15

Open
