# High-performance computing {#hpc}
```{r, message = FALSE, warning = FALSE, echo = FALSE}
knitr::opts_knit$set(root.dir = fs::dir_create(tempfile()))
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
options(
drake_make_menu = FALSE,
drake_clean_menu = FALSE,
warnPartialMatchArgs = FALSE,
crayon.enabled = FALSE,
readr.show_progress = FALSE,
tidyverse.quiet = TRUE
)
```
```{r, message = FALSE, warning = FALSE, echo = FALSE}
library(future)
library(drake)
library(dplyr)
```
This chapter provides guidance on time-consuming `drake` workflows and high-level parallel computation.
## Start small
Before you jump into high-performance computing with a large workflow, consider running a downsized version to debug and test things first. That way, you can avoid consuming lots of computing resources until you are reasonably sure everything works. Create a test plan with `drake_plan(max_expand = SMALL_NUMBER)` before scaling up to the full set of targets, and take temporary shortcuts in your commands so your targets build more quickly for test mode. See [this section on plans](https://books.ropensci.org/drake/static.html#start-small) for details.
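Here is a minimal sketch of that idea. `fit_model()` and the 100 index values are hypothetical stand-ins for your own function and full grid; `max_expand` caps how many targets `map()` generates while you test.
```{r, eval = FALSE}
library(drake)
# Downsized test plan: only 2 of the 100 possible model targets.
plan <- drake_plan(
  max_expand = 2,
  model = target(
    fit_model(index), # fit_model() is a placeholder for your own function.
    transform = map(index = !!seq_len(100))
  )
)
```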
## Let `make()` schedule your targets
When it comes time to activate high-performance computing, `drake` launches its own parallel workers and sends targets to those workers. The workers can be local processes or jobs on a cluster. `drake` uses your project's implicit dependency graph to figure out which targets can run in parallel and which ones need to wait for dependencies.
```{r}
load_mtcars_example() # from https://github.com/wlandau/drake-examples/tree/main/mtcars
vis_drake_graph(my_plan)
```
You do not need to micromanage how targets are scheduled, and you do not need to run simultaneous instances of `make()`.
## The main process
`make()` takes care of the jobs it launches, but `make()` *itself* is a job too, and it is your responsibility to manage it.
### Main process on a cluster
Most clusters will let you submit `make()` as a job on a compute node. Let's consider the Sun Grid Engine (SGE) as an example. First, we create a script that calls `make()` (or [`r_make()`](https://docs.ropensci.org/drake/reference/r_make.html)).
```{r, eval = FALSE}
# make.R
source("R/packages.R")
source("R/packages.R")
source("R/packages.R")
options(
clustermq.scheduler = "sge",
# Created by drake_hpc_template_file("sge_clustermq.tmpl"):
clustermq.template = "sge_clustermq.tmpl"
)
make(
plan,
parallelism = "clustermq",
jobs = 8,
console_log_file = "drake.log"
)
```
Then, we create a shell script (say, `run.sh`) to call `make.R`. This script may look different if you use a different scheduler such as [SLURM](https://slurm.schedmd.com).
```{bash, eval = FALSE}
#!/bin/bash
# run.sh
#$ -j y # combine stdout/error in one file
#$ -o log.out # output file
#$ -cwd # use pwd as work dir
#$ -V # use environment variables
# module load R # Uncomment this line if R is an environment module.
R CMD BATCH --no-save make.R
```
Finally, to run the whole workflow, we call `qsub`.
```{bash, eval = FALSE}
qsub run.sh
```
And here is what happens:
1. A new job starts on the cluster with the configuration flags next to `#$` in `run.sh`.
2. `run.sh` opens R and runs `make.R`.
3. `make.R` invokes `drake` using the `make()` function.
4. `make()` launches 8 new jobs on the cluster to run targets in parallel.
So 9 simultaneous jobs run on the cluster, and we avoid burdening the head node / login node.
### Local main process
Alternatively, you can run `make()` in a persistent background process. The following should work in the Mac/Linux terminal/shell.
```{bash, eval = FALSE}
nohup nice -19 R CMD BATCH --no-save make.R &
```
where:
- `nohup`: Keep the job running even if you log out of the machine.
- `nice -19`: This is a low-priority job that should not consume many resources. Other processes should take priority.
- `R CMD BATCH`: Run the R script in a fresh R session.
- `--no-save`: Do not save the workspace to a `.RData` file.
- `&`: Run this job in the background so you can do other stuff in the terminal window.
Alternatives to `nohup` include [`screen`](https://linuxize.com/post/how-to-use-linux-screen/) and [Byobu](http://byobu.co/).
## Parallel backends
Choose the parallel backend with the `parallelism` argument and set the `jobs` argument to scale the work appropriately.
```{r, eval = FALSE}
make(my_plan, parallelism = "future", jobs = 2)
```
The two primary backends with long-term support are [`clustermq`](https://github.com/mschubert/clustermq) and [`future`](https://github.com/HenrikBengtsson/future). If you can install [ZeroMQ](http://zeromq.org), the best choice is usually [`clustermq`](https://github.com/mschubert/clustermq). (It is faster than [`future`](https://github.com/HenrikBengtsson/future).) However, [`future`](https://github.com/HenrikBengtsson/future) is more accessible: it does not require [ZeroMQ](http://zeromq.org), it supports parallel computing on Windows, it can work with more restrictive wall time limits on clusters, and it [can deploy targets to Docker images](https://github.com/wlandau/drake-examples/tree/main/Docker-psock) (`drake_example("Docker-psock")`).
## The `clustermq` backend
### Persistent workers
`make(parallelism = "clustermq", jobs = 2)` launches 2 parallel *persistent workers*. The main process assigns targets to workers, and the workers simultaneously traverse the dependency graph.
<script src="https://fast.wistia.com/embed/medias/ycczhxwkjw.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.21% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_ycczhxwkjw videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/ycczhxwkjw/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" onload="this.parentNode.style.opacity=1;" /></div></div></div></div>
### Installation
Persistent workers require the [`clustermq`](https://github.com/mschubert/clustermq) R package, which in turn requires [ZeroMQ](http://zeromq.org/). Please refer to the [`clustermq` installation guide](https://github.com/mschubert/clustermq/blob/master/README.md#installation) for specific instructions.
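If ZeroMQ is already installed on your system, the CRAN release is usually enough:
```{r, eval = FALSE}
install.packages("clustermq") # Needs a system installation of ZeroMQ.
```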
### On your local machine
To run your targets in parallel over the cores of your local machine, set the global option below and run `make()`.
```{r, eval = FALSE}
options(clustermq.scheduler = "multicore")
make(plan, parallelism = "clustermq", jobs = 2)
```
### On a cluster
Set the [`clustermq`](https://github.com/mschubert/clustermq) global options to register your computing resources. For [SLURM](https://slurm.schedmd.com/slurmd.html):
```{r, eval = FALSE}
options(clustermq.scheduler = "slurm", clustermq.template = "slurm_clustermq.tmpl")
```
Here, `slurm_clustermq.tmpl` is a [template file](https://github.com/ropensci/drake/tree/main/inst/templates/hpc) with configuration details. Use `drake_hpc_template_file()` to write one of the available examples.
```{r, eval = FALSE}
drake_hpc_template_file("slurm_clustermq.tmpl") # Write the file slurm_clustermq.tmpl.
```
After modifying `slurm_clustermq.tmpl` by hand to meet your needs, call `make()` as usual.
```{r, eval = FALSE}
make(plan, parallelism = "clustermq", jobs = 4)
```
## The `future` backend
### Transient workers
`make(parallelism = "future", jobs = 2)` launches *transient workers* to build your targets. When a target is ready to build, the main process creates a fresh worker to build it, and the worker terminates when the target is done. `jobs = 2` means that at most 2 transient workers are allowed to run at a given time.
<script src="https://fast.wistia.com/embed/medias/340yvlp515.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.21% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_340yvlp515 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/340yvlp515/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><br>
### Installation
Install the [`future`](https://github.com/HenrikBengtsson/future) package.
```{r, eval = FALSE}
install.packages("future") # CRAN release
# Alternatively, install the GitHub development version.
devtools::install_github("HenrikBengtsson/future", ref = "develop")
```
If you intend to use a cluster, be sure to install the [`future.batchtools`](https://github.com/HenrikBengtsson/future.batchtools) package too. The [`future`](https://github.com/HenrikBengtsson/future) ecosystem contains even more packages that extend [`future`](https://github.com/HenrikBengtsson/future)'s parallel computing functionality, such as [`future.callr`](https://github.com/HenrikBengtsson/future.callr).
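Both of those extension packages are on CRAN:
```{r, eval = FALSE}
install.packages("future.batchtools") # For clusters (SLURM, SGE, TORQUE, ...).
install.packages("future.callr")      # For local callr-based workers.
```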
### On your local machine
First, select a [`future`](https://github.com/HenrikBengtsson/future) plan to tell [`future`](https://github.com/HenrikBengtsson/future) how to create the workers. See [this table](https://github.com/HenrikBengtsson/future#controlling-how-futures-are-resolved) for descriptions of the core options.
```{r, eval = FALSE}
future::plan(future::multiprocess)
```
Next, run `make()`.
```{r, eval = FALSE}
make(plan, parallelism = "future", jobs = 2)
```
### On a cluster
Install the [`future.batchtools`](https://github.com/HenrikBengtsson/future.batchtools) package and use [this list](https://github.com/HenrikBengtsson/future.batchtools#choosing-batchtools-backend) to select a [`future`](https://github.com/HenrikBengtsson/future) plan that matches your resources. You will also need a compatible [template file](https://github.com/mllg/batchtools/tree/master/inst/templates) with configuration details. As with [`clustermq`](https://github.com/mschubert/clustermq), `drake` can generate some examples:
```{r, eval = FALSE}
drake_hpc_template_file("slurm_batchtools.tmpl") # Edit by hand.
```
Next, register the template file with a plan.
```{r, eval = FALSE}
library(future.batchtools)
future::plan(batchtools_slurm, template = "slurm_batchtools.tmpl")
```
Finally, run `make()`.
```{r, eval = FALSE}
make(plan, parallelism = "future", jobs = 2)
```
## Advanced options
### Selectivity
Some targets build so quickly that it is not worth sending them to parallel workers. To run these targets locally in the main process, define a special `hpc` column of your `drake` plan. Below, `NA` and `TRUE` are treated the same, and `make(plan, parallelism = "clustermq")` only sends `model_1` and `model_2` to parallel workers.
```{r}
drake_plan(
model = target(
crazy_long_computation(index),
transform = map(index = c(1, 2))
),
accuracy = target(
summarize_accuracy(model),
transform = combine(model),
hpc = FALSE
),
specificity = target(
summarize_specificity(model),
transform = combine(model),
hpc = FALSE
),
report = target(
render(knitr_in("results.Rmd"), output_file = file_out("results.html")),
hpc = FALSE
)
)
```
### Memory options
By default, `make()` keeps targets in memory at runtime. Some targets are dependencies of other targets downstream, while others no longer need to stay in memory at all. The `memory_strategy` argument of `make()` lets you choose the tradeoff that best suits your project. Options:
- `"speed"`: Once a target is loaded in memory, just keep it there. This choice maximizes speed and hogs memory.
- `"memory"`: Just before building each new target, unload everything from memory except the target's direct dependencies. This option conserves memory, but it sacrifices speed because each new target needs to reload any previously unloaded targets from storage.
- `"lookahead"`: Just before building each new target, search the dependency graph to find targets that will not be needed for the rest of the current `make()` session. In this mode, targets are only in memory if they need to be loaded, and we avoid superfluous reads from the cache. However, searching the graph takes time, and it could even double the computational overhead for large projects.
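For example, to favor low memory usage over speed:
```{r, eval = FALSE}
# Before each target, drop everything except the target's dependencies.
make(plan, memory_strategy = "memory")
```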
### Storage options
In `make(caching = "main")`, the workers send the targets to the main process, and the main process stores them one by one in the cache. `caching = "main"` is compatible with all [`storr`](https://github.com/richfitz/storr) cache formats, including the more esoteric ones like `storr_dbi()` and `storr_environment()`.
In `make(caching = "worker")`, the parallel workers are responsible for writing the targets to the cache. Some output-heavy projects can benefit from this form of parallelism. However, it can slow things down on clusters because of network file system latency. And there are additional restrictions:
- All the workers must have the same file system and the same working directory as the main process.
- Only the default `storr_rds()` cache may be used. Other formats like `storr_dbi()` and `storr_environment()` cannot accommodate parallel cache operations.
See the [storage chapter](#storage) for details.
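For example, on a cluster with a shared file system and the default `storr_rds()` cache, you could let the workers write the targets themselves:
```{r, eval = FALSE}
make(plan, parallelism = "clustermq", jobs = 4, caching = "worker")
```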
### The `template` argument for persistent workers
For more control and flexibility in the [`clustermq`](https://github.com/mschubert/clustermq) backend, you can parameterize your template file and use the `template` argument of `make()`. For example, suppose you want to programmatically set the number of "slots" (basically cores) per job on an [SGE system](http://gridscheduler.sourceforge.net/htmlman/manuals.html) (see the `clustermq` guide to SGE setup [here](https://github.com/mschubert/clustermq/wiki/SGE)). Begin with a parameterized template file `sge_clustermq.tmpl` with a custom `n_slots` placeholder.
```
# File: sge_clustermq.tmpl
# Modified from https://github.com/mschubert/clustermq/wiki/SGE
#$ -N {{ job_name }} # job name
#$ -t 1-{{ n_jobs }} # submit jobs as array
#$ -j y # combine stdout/error in one file
#$ -o {{ log_file | /dev/null }} # output file
#$ -cwd # use pwd as work dir
#$ -V # use environment variables
#$ -pe smp {{ n_slots | 1 }} # request n_slots cores per job
module load R
ulimit -v $(( 1024 * {{ memory | 4096 }} ))
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
```
Then when you run `make()`, use the `template` argument to set `n_slots`.
```{r, eval = FALSE}
options(clustermq.scheduler = "sge", clustermq.template = "sge_clustermq.tmpl")
library(drake)
load_mtcars_example()
make(
my_plan,
parallelism = "clustermq",
jobs = 16,
template = list(n_slots = 4) # Request 4 cores per persistent worker.
)
```
Custom placeholders like `n_slots` are processed with the [`infuser`](https://github.com/Bart6114/infuser) package.
### The `resources` column for transient workers
Different targets may need different resources. For example,
```{r}
plan <- drake_plan(
data = download_data(),
model = big_machine_learning_model(data)
)
```
The `model` needs a GPU and multiple CPU cores, and the `data` only needs the bare minimum resources. Declare these requirements with `target()`, as below. This is equivalent to adding a new list column to the `plan`, where each element is a named list for the `resources` argument of `future::future()`.
```{r}
plan <- drake_plan(
data = target(
download_data(),
resources = list(cores = 1, gpus = 0)
),
model = target(
big_machine_learning_model(data),
resources = list(cores = 4, gpus = 1)
)
)
plan
str(plan$resources)
```
Next, plug the names of your resources into the [`brew`](https://CRAN.R-project.org/package=brew) patterns of your [`batchtools`](https://github.com/mllg/batchtools) template file. The following `sge_batchtools.tmpl` file shows how to do it, but the file itself probably requires modification before it will work with your own machine.
```
#!/bin/bash
#$ -cwd
#$ -j y
#$ -o <%= log.file %>
#$ -V
#$ -N <%= job.name %>
#$ -pe smp <%= resources[["cores"]] %> # CPU cores
#$ -l gpu=<%= resources[["gpus"]] %> # GPUs.
Rscript -e 'batchtools::doJobCollection("<%= uri %>")'
exit 0
```
Finally, register the template file and run your project.
```{r, eval = FALSE}
library(drake)
library(future.batchtools)
future::plan(batchtools_sge, template = "sge_batchtools.tmpl")
make(plan, parallelism = "future", jobs = 2)
```
### Parallel computing *within* targets
To recruit parallel processes within individual targets, we recommend the [`future.callr`](https://github.com/HenrikBengtsson/future.callr) and [`furrr`](https://github.com/DavisVaughan/furrr) packages. Usage details depend on the parallel backend you choose for `make()`. If you must write custom code with `mclapply()`, please read the subsection below on locked bindings/environments.
#### Locally
Use [`future.callr`](https://github.com/HenrikBengtsson/future.callr) and [`furrr`](https://github.com/DavisVaughan/furrr) normally.
```{r, eval = FALSE}
library(drake)
# The targets just collect the process IDs of the callr processes.
plan <- drake_plan(
x = furrr::future_map_int(1:2, function(x) Sys.getpid()),
y = furrr::future_map_int(1:2, function(x) Sys.getpid())
)
# Tell the drake targets to fork local callr processes
# (future::availableCores() of them by default).
future::plan(future.callr::callr)
# Build the targets.
make(plan)
# Process IDs of the local workers of x:
readd(x)
```
If you happen to be using `r_make()` and want to use parallelism _within_ targets, `future::plan(future.callr::callr)` needs to go inside `_drake.R` before the call to `drake_config()`. In that case your `_drake.R` file would look like this:
```{r, eval = FALSE}
plan <- drake_plan(
x = furrr::future_map_int(1:2, function(x) Sys.getpid()),
y = furrr::future_map_int(1:2, function(x) Sys.getpid())
)
future::plan(future.callr::callr)
drake_config(plan)
```
This allows you to call `r_make()` as usual. Configuring your `_drake.R` file this way is necessary because `r_make()` launches an external R process that runs `_drake.R` followed by `make()`. See [this section](https://books.ropensci.org/drake/projects.html#safer-interactivity) for more information on `r_make()`.
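With `_drake.R` configured, the call itself is unchanged:
```{r, eval = FALSE}
r_make() # Runs _drake.R in a fresh external R session, then make().
```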
#### Persistent workers
Each persistent worker needs its own [`future::plan()`](https://github.com/HenrikBengtsson/future), which we set with the `prework` argument of `make()`. The following example uses [SGE](http://gridscheduler.sourceforge.net/htmlman/manuals.html). To learn about templates for other clusters, please consult the [`clustermq`](https://github.com/mschubert/clustermq) documentation.
```{r, eval = FALSE}
library(drake)
# The targets just collect the process IDs of the callr processes.
plan <- drake_plan(
x = furrr::future_map_int(1:2, function(x) Sys.getpid()),
y = furrr::future_map_int(1:2, function(x) Sys.getpid())
)
# Write a template file for clustermq.
writeLines(
c(
"#!/bin/bash",
"#$ -N {{ job_name }} # job name",
"#$ -t 1-{{ n_jobs }} # submit jobs as array",
"#$ -j y # combine stdout/error in one file",
"#$ -o {{ log_file | /dev/null }} # output file",
"#$ -cwd # use pwd as work dir",
"#$ -V # use environment variables",
"#$ -pe smp 4 # request 4 cores per job",
"module load R-qualified/3.5.2 # if loading R from an environment module",
"ulimit -v $(( 1024 * {{ memory | 4096 }} ))",
"CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"
),
"sge_clustermq.tmpl"
)
# Register the scheduler and template file with clustermq.
options(
clustermq.scheduler = "sge",
clustermq.template = "sge_clustermq.tmpl"
)
# Build the targets.
make(
plan,
parallelism = "clustermq",
jobs = 2,
# Each of the two workers can spawn up to 4 local processes.
prework = quote(future::plan(future.callr::callr))
)
# Process IDs of the local workers of x:
readd(x)
```
#### Transient workers
As explained in the [`future` vignette](https://cran.r-project.org/web/packages/future/vignettes/future-1-overview.html), we can nest our `future` plans. Each target gets its own remote job, and each job can spawn up to 4 local `callr` processes. The following example uses [SGE](http://gridscheduler.sourceforge.net/htmlman/manuals.html). To learn about templates for other clusters, please consult the [`future.batchtools`](https://github.com/HenrikBengtsson/future.batchtools) documentation.
```{r, eval = FALSE}
library(drake)
# The targets just collect the process IDs of the callr processes.
plan <- drake_plan(
x = furrr::future_map_int(1:2, function(x) Sys.getpid()),
y = furrr::future_map_int(1:2, function(x) Sys.getpid())
)
# Write a template file for future.batchtools.
writeLines(
c(
"#!/bin/bash",
"#$ -cwd # use pwd as work dir",
"#$ -j y # combine stdout/error in one file",
"#$ -o <%= log.file %> # output file",
"#$ -V # use environment variables",
"#$ -N <%= job.name %> # job name",
"#$ -pe smp 4 # 4 cores per job",
"module load R # if loading R from an environment module",
"Rscript -e 'batchtools::doJobCollection(\"<%= uri %>\")'",
"exit 0"
),
"sge_batchtools.tmpl"
)
# In our nested plans, each target gets its own remote SGE job,
# and each worker can spawn up to 4 `callr` processes.
future::plan(
list(
future::tweak(
future.batchtools::batchtools_sge,
template = "sge_batchtools.tmpl"
),
future.callr::callr
)
)
# Build the targets.
make(plan, parallelism = "future", jobs = 2)
# Process IDs of the local workers of x:
readd(x)
```
#### Number of local workers per target
By default, `future::availableCores()` determines the number of local [`callr`](https://github.com/r-lib/callr) workers. To better manage resources, you may wish to further restrict the number of [`callr`](https://github.com/r-lib/callr) workers for all targets in the plan, e.g. `future::plan(future.callr::callr, workers = 4L)` or:
```{r, eval = FALSE}
future::plan(
list(
future::tweak(
future.batchtools::batchtools_sge,
template = "sge_batchtools.tmpl"
),
future::tweak(future.callr::callr, workers = 4L)
)
)
```
Alternatively, you can use chunking to prevent individual targets from using too many workers, e.g. `furrr::future_map(.options = furrr::future_options(scheduling = 4))`. Here, the `scheduling` argument sets the average number of futures per worker.
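A minimal sketch of that chunking idea, reusing the hypothetical process-ID targets from above:
```{r, eval = FALSE}
plan <- drake_plan(
  x = furrr::future_map_int(
    1:16,
    function(x) Sys.getpid(),
    # Average 4 futures per worker instead of one future per element.
    .options = furrr::future_options(scheduling = 4)
  )
)
```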
#### Locked binding/environment errors
Some workflows unavoidably use `mclapply()`, which is known to modify the global environment against `drake`'s will. If you are stuck, there are two workarounds.
1. Use `make(lock_envir = FALSE)`.
2. Use the `envir` argument of `make()`. That way, `drake` locks your special custom environment instead of the global environment.
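Workaround 1 is a single flag:
```{r, eval = FALSE}
make(plan, lock_envir = FALSE) # drake no longer locks the environment.
```
Workaround 2 looks like this: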
```{r, eval = FALSE}
# Load the main example: https://github.com/wlandau/drake-examples
library(drake)
drake_example("main")
setwd("main")
# Define and populate a special custom environment.
envir <- new.env(parent = globalenv())
source("R/packages.R", local = envir)
source("R/functions.R", local = envir)
source("R/plan.R", local = envir)
# Check the contents of your environments.
ls(envir) # Should have your functions and plan
ls() # The global environment should only have what you started with.
# Build the targets using your custom environment
make(envir$plan, envir = envir)
```