Improve ambiguous logging when max_submit is 1 #7759
Comments
Suggestion what to log:
|
Apparently, when mimicking NFS syncing issues, we do not get any logs either. Copied from @eivindjahren's message: If
You can reproduce it with fault injection by not writing the
--- a/src/ert/enkf_main.py
+++ b/src/ert/enkf_main.py
@@ -231,7 +231,7 @@ def create_run_path(
run_context.iteration,
)
- json.dump(forward_model_output, fptr)
+ # json.dump(forward_model_output, fptr)
run_context.runpaths.write_runpath_list(
[run_context.iteration], run_context.active_realizations
class LegacyEnsemble(Ensemble):
@@ -226,7 +227,7 @@ async def _evaluate_inner( # pylint: disable=too-many-branches
self.min_required_realizations if self.stop_long_running else 0
)
- queue.add_dispatch_information_to_jobs_file()
+ # queue.add_dispatch_information_to_jobs_file()
result = await queue.execute(min_required_realizations)
except Exception:
The logging might already be fixed by 50a4421; it just needs to be tested. |
It did not fix it. |
What we should do is "find out" that the job does not run and get the LSF stdout into the logs. |
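To make the idea of getting the queue's stdout into the logs concrete, here is a minimal sketch. It is not ERT's code: `tail_of_file`, `log_queue_output`, and the file paths passed to them are hypothetical names, and it assumes the queue system writes the job's stdout and stderr to files we can read.

```python
import logging
from collections import deque
from pathlib import Path

logger = logging.getLogger(__name__)


def tail_of_file(path: Path, max_lines: int = 20) -> str:
    """Return the last `max_lines` lines of `path`, or a placeholder if unreadable.

    Hypothetical helper; a file that is missing (for example because of an
    NFS sync delay) is reported instead of raising.
    """
    try:
        with path.open(encoding="utf-8", errors="replace") as fh:
            # A bounded deque keeps only the last `max_lines` lines in memory.
            return "".join(deque(fh, maxlen=max_lines)).rstrip()
    except OSError as err:
        return f"<could not read {path}: {err}>"


def log_queue_output(job_name: str, stdout_file: Path, stderr_file: Path) -> None:
    """Log the tail of the queue's stdout and stderr files for a failed job."""
    for label, path in (("stdout", stdout_file), ("stderr", stderr_file)):
        content = tail_of_file(path) or "<empty>"
        logger.error("Job %s queue %s:\n%s", job_name, label, content)
```

Logging a placeholder for an unreadable file matters here, since a missing file is itself a hint that the job never ran.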
Currently, even though MAX_SUBMIT is set to 1, we log the failure with "failed after reaching max submit". We should provide a more detailed explanation in job.handle_failure. Also, job._callback_status_msg might be empty, which produces empty output.
Definition of done:
If a job fails, we should provide all the relevant information coming from the driver and from the queue's stdout and stderr files.
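As a rough illustration of what a less ambiguous message could look like, the sketch below builds the failure text from the pieces mentioned above. Everything in it is an assumption for discussion: `describe_failure` and its parameters are hypothetical and do not reflect the actual signature of job.handle_failure.

```python
from typing import Optional


def describe_failure(
    job_name: str,
    attempts: int,
    max_submit: int,
    status_msg: Optional[str] = None,
    stdout_tail: str = "",
    stderr_tail: str = "",
) -> str:
    """Build a failure message for a job (hypothetical helper, not ERT's API)."""
    if max_submit <= 1:
        # With a single allowed submission, "reaching max submit" is misleading.
        headline = (
            f"{job_name} failed on its only attempt "
            f"(MAX_SUBMIT is {max_submit}, so it was not resubmitted)"
        )
    else:
        headline = (
            f"{job_name} failed after reaching max submit "
            f"({attempts}/{max_submit} attempts)"
        )

    # Never emit an empty line when the callback status message is missing.
    status = status_msg.strip() if status_msg and status_msg.strip() else ""
    parts = [headline, f"status: {status or '<no status message reported>'}"]

    # Append whatever the driver/queue wrote, marking empty output explicitly.
    for label, text in (("stdout", stdout_tail), ("stderr", stderr_tail)):
        parts.append(f"queue {label}: {text.strip() or '<empty>'}")
    return "\n".join(parts)
```

For example, `describe_failure("real_0", 1, 1, status_msg="")` yields a headline about the single attempt and an explicit "no status message" line, rather than a blank one.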