
Issues with processing large set of coordinates with foreach() #60

Open
vrathi0 opened this issue Dec 11, 2020 · 2 comments

Comments


vrathi0 commented Dec 11, 2020

Hi Rich,

Thanks so much for such a useful tool.

I am having some problems using the package with foreach().

Problem: Model runs yield incomplete data; that is, a lot of data is dropped during the model run. I am mainly using the function hysplit_trajectory().

Possible cause: I looked into the source code, and it seems there are a lot of files being copied/moved from one location to another. That might cause parallel processes to trip over each other, as one worker might delete a file before another has had a chance to read it.

For example, in line 303 of hysplit_trajectory.R:

unlink(file.path(exec_dir, trajectory_files), force = TRUE)

The above line deletes some files, and it sits outside the "clean_up" toggle, so the user has no way to prevent the deletion (lines 321-324 below):

 if (clean_up) {
    unlink(file.path(exec_dir, traj_output_files()), force = TRUE)
    unlink(recep_file_path_stack, recursive = TRUE, force = TRUE)
  }

Also, could you create the files in one location instead of moving them at runtime?

# Move files into the output folder
    file.copy(
      from = file.path(exec_dir, trajectory_files),
      to = recep_file_path,
      copy.mode = TRUE
    )

Can you comment on this? My diagnosis might be wrong, but the problem remains: using hysplit_trajectory() inside foreach() parallel loops causes worker processes to drop a lot of data, resulting in incomplete output.
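For what it's worth, one workaround consistent with this diagnosis would be to give each worker its own scratch directory via the exec_dir argument, so that no two workers copy into or unlink() files from the same path. A minimal sketch under that assumption; the coordinates, date, height, and met settings below are placeholders, not real inputs:

```r
library(foreach)
library(doParallel)
library(splitr)

registerDoParallel(cores = 4)

# Hypothetical set of receptor locations
coords <- data.frame(lat = c(42.83, 40.71), lon = c(-80.30, -74.01))

trajs <- foreach(i = seq_len(nrow(coords)),
                 .combine = rbind,
                 .packages = "splitr") %dopar% {

  # Per-worker scratch directory: no other worker copies into or
  # deletes files from this path during its run
  worker_dir <- file.path(tempdir(), sprintf("hysplit_%03d", i))
  dir.create(worker_dir, recursive = TRUE, showWarnings = FALSE)

  hysplit_trajectory(
    lat         = coords$lat[i],
    lon         = coords$lon[i],
    height      = 50,
    duration    = 24,
    days        = "2020-02-01",
    daily_hours = c(0, 12),
    direction   = "backward",
    met_type    = "gdas1",
    exec_dir    = worker_dir,   # isolate this worker's run files
    clean_up    = TRUE
  )
}
```

With forked/local workers tempdir() is shared across the session, so the per-iteration subdirectory is what actually keeps the runs apart.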

Thanks so much,

@juliombarros

Same issue here. Did you find a way out so far?

@juliombarros

By the way, instead of using foreach we are using future_pmap and having the same issue: a lot of data is dropped and the output is incomplete.
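A sketch of the same per-worker exec_dir idea with furrr's future_pmap_dfr, in case the isolation trick carries over; the coordinates and run settings are again placeholders:

```r
library(furrr)
library(splitr)

plan(multisession, workers = 4)

# Hypothetical receptor locations
coords <- data.frame(lat = c(42.83, 40.71), lon = c(-80.30, -74.01))

trajs <- future_pmap_dfr(coords, function(lat, lon) {
  # Under plan(multisession) each worker is a separate R session, so
  # tempdir() already differs per worker; the subdirectory below
  # additionally separates runs within one worker
  run_dir <- file.path(tempdir(), sprintf("hysplit_%.2f_%.2f", lat, lon))
  dir.create(run_dir, recursive = TRUE, showWarnings = FALSE)

  hysplit_trajectory(
    lat = lat, lon = lon,
    height = 50, duration = 24,
    days = "2020-02-01", daily_hours = c(0, 12),
    direction = "backward", met_type = "gdas1",
    exec_dir = run_dir,   # keep this run's HYSPLIT files isolated
    clean_up = TRUE
  )
})
```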
