Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix learn-ocaml build -j 2 issue #416

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

erikmd
Copy link
Member

@erikmd erikmd commented Sep 8, 2021

This PR is intended to Reproduce and Fix #414

…atch guard

Aim: mitigate a TOC/TOU race condition which occurs when
Learnocaml_process_exercise_repository.n_processes > 1,
triggered by a command such as "learn-ocaml build -j 2".

This isn't the only possible fix/workaround, just a concise one
relying on error handling.
Instead of tweaking Lwt_utils.mkdir_p, we add a call to this function
for the "www/static" directory before running the parallel build jobs.

MINOR-DRAWBACK: "www/static" is created even if there is no exercise!
@erikmd
Copy link
Member Author

erikmd commented Sep 8, 2021

Hi all,

I've tried 2 different fixes in this PR (tweak Lwt_utils.mkdir_p (in my reverted commit) or just calling Lwt_utils.mkdir_p once more).

Both fix the issue mentioned in issue #414: Avoid Cannot process exercises: Unix.Unix_error(Unix.EEXIST, "mkdir", "./www/static")

However, the issue is more pervasive than what I thought at first sight: despite this potential fix,
the parallel processing is still broken… because we get randomly, with the 2-exercises demo repositiory, one of these outcomes:

  1. OK
Learnocaml v.0.12 running.
Updating app at ./www
demo2                        [OK]
demo                         [OK]
  1. Partial build and freeze (the build process hangs):
Updating app at ./www
demo2                        [OK]
  1. something yet more strange:
Learnocaml v.0.12 running.
Updating app at ./www
Grader error: Lwt.Resolution_loop.Canceled
demo                       [FAILED]
demo2                        [OK]
demo                         [OK]

So, I'm going to postpone this for the time being − fixing this parallel build issue would be nice, but will require more investigation in src/repo/learnocaml_process_exercise_repository.ml

(FTR, the outputs above mention "0.12" but the tested version was obviously the commits from this PR;
anyway, I think "the ambient version number" should be bumped in master anytime soon, and after each release)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

learn-ocaml build -j 2 is broken
1 participant