Question: is coro thread safe? #44

Closed
rkrug opened this issue Mar 26, 2024 · 4 comments


rkrug commented Mar 26, 2024

I have the following code, in which creating the generator and looping over it are executed in parallel for different values.

The lock file (indicated in the code) is sometimes not deleted after completion.

Therefore my question: is using coro in a parallel processing environment safe, i.e. is coro thread safe?

Thanks,

Rainer

pbmcapply::pbmclapply(
            years,
            function(y) {

                    output_path <- file.path(SOME_BASEDIR, y)    ######### <<<<<------- EDIT HERE
                    oar <- openalexR::oa_generate(
                        ...
                    )

                # set <- vector("list", set_size)
                set <- NULL
                set_no <- 0

                file.create(file.path(output_path, "00_in_progress_00")) ## <- process lock file
                coro::loop(
                    for (x in oar) {
                        set <- c(set, list(x))
                        if ((length(set) >= set_size) | isTRUE(x == coro::exhausted())) {
                            saveRDS(set, file.path(output_path, paste0("set_", set_no, ".rds")))
                            # set <- vector("list", set_size) # reset recs
                            set <- list()
                            set_no <- set_no + 1
                        }
                    }
                )
                ### and save the last one
                saveRDS(set, file.path(output_path, paste0("set_", set_no, ".rds")))
                file.create(file.path(output_path, "00_complete_00"))
                file.rename(
                    file.path(output_path, "00_in_progress_00"),
                    file.path(output_path, "00_complete_00")
                )
            },
            mc.cores = mc_cores,
            mc.preschedule = FALSE
        )
Member

lionel- commented Mar 26, 2024

From a cursory look, if pbmclapply() runs the callback in parallel, it seems to me you have race conditions involved in the creation and deletion of the lock file? I mean in the parts that don't involve coro at all.

(of course coro is threadsafe but I'm not sure what you mean by this)

Author

rkrug commented Mar 26, 2024

Sorry - one point is missing in the example, i.e. that output_path is different for each y (see EDIT HERE in the edited example). If there is still a race condition, I am missing it.

Member

lionel- commented Mar 26, 2024

I'm not sure how this code works or what it tries to achieve and I can't help you without an actual reprex. (Even with one I'm a bit swamped currently.) But I don't think coro is the culprit here.
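
One way to check the lock-file handling on its own, without coro or openalexR, is a small base R sketch like the one below. It is only an illustration under assumptions: `with_progress_marker()` and the `00_failed_00` marker name are hypothetical and not part of the original code, and the demo uses `lapply()` so that it runs anywhere. The idea is that `on.exit()` resolves the in-progress marker even when the work function throws, so a leftover `00_in_progress_00` would point to a worker that was killed rather than one that merely errored.

```r
# Hypothetical helper (not from the original post): run `work(out_dir)` and
# always resolve the in-progress marker, whether `work` succeeds or fails.
with_progress_marker <- function(out_dir, work) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  in_progress <- file.path(out_dir, "00_in_progress_00")
  complete <- file.path(out_dir, "00_complete_00")
  failed <- file.path(out_dir, "00_failed_00") # assumed marker name

  file.create(in_progress)
  ok <- FALSE
  # on.exit() also runs when `work` signals an error.
  on.exit(
    file.rename(in_progress, if (ok) complete else failed),
    add = TRUE
  )

  result <- work(out_dir)
  ok <- TRUE
  invisible(result)
}

# Demo with one simulated failure; each year gets its own directory.
base_dir <- file.path(tempdir(), "marker_demo")
years <- c(2001, 2002, 2003)
res <- lapply(years, function(y) {
  try(
    with_progress_marker(file.path(base_dir, y), function(dir) {
      if (y == 2002) stop("simulated failure")
      saveRDS(y, file.path(dir, "set_0.rds"))
    }),
    silent = TRUE
  )
})

# List which marker each directory ended up with.
list.files(base_dir, pattern = "^00_", recursive = TRUE)
```

The same anonymous function could be passed to `pbmcapply::pbmclapply()` in place of `lapply()`. If in-progress markers still remain in that setting, the forked worker was probably terminated before `on.exit()` could run.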

Author

rkrug commented Mar 26, 2024

OK - thanks. I will look into it and come back to you if needed.

rkrug closed this as completed Mar 26, 2024