Refactor Lsf driver to use dataclasses #7915

Open
wants to merge 4 commits into
base: main

jonathan-eq
Contributor

@jonathan-eq jonathan-eq commented May 16, 2024

Issue
Resolves #7879

Approach
The commit in this PR refactors the LSF driver from Pydantic classes to dataclasses, as Pydantic was overkill for our usage.

(Screenshot of new behavior in GUI if applicable)

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure tests pass locally (after every commit!)

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Create Backport PR to latest release

@jonathan-eq jonathan-eq added the improvement and release-notes:skip labels May 16, 2024
@codecov-commenter

codecov-commenter commented May 16, 2024

Codecov Report

Attention: Patch coverage is 94.59459%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 86.00%. Comparing base (0eabeee) to head (a3780aa).
Report is 55 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| src/ert/scheduler/lsf_driver.py | 94.59% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7915      +/-   ##
==========================================
+ Coverage   85.82%   86.00%   +0.17%     
==========================================
  Files         378      382       +4     
  Lines       23062    23601     +539     
  Branches      621      628       +7     
==========================================
+ Hits        19794    20298     +504     
- Misses       3189     3229      +40     
+ Partials       79       74       -5     


@jonathan-eq jonathan-eq marked this pull request as ready for review May 16, 2024 10:00
@berland
Contributor

berland commented May 21, 2024

This deviates from the way OpenPBS driver works. Should we keep them separate or also change OpenPBS to use dataclasses?

@jonathan-eq
Contributor Author

> This deviates from the way OpenPBS driver works. Should we keep them separate or also change OpenPBS to use dataclasses?

I think it would be best to do them separately to keep the PRs smaller.
I have created a separate issue for the OpenPBS driver (#7938)

@berland
Contributor

berland commented May 21, 2024

> > This deviates from the way OpenPBS driver works. Should we keep them separate or also change OpenPBS to use dataclasses?
>
> I think it would be best to do them separately to keep the PRs smaller. I have created a separate issue for the OpenPBS driver (#7938)

It is not as obvious for OpenPBS as it is for LSF that it should change to dataclasses, since it actually uses JSON parsing code from Pydantic. So it might not be correct to port OpenPBS to dataclasses, and then the question remains whether LSF should be kept on Pydantic for consistency. The weight of Pydantic might not matter unless we have measured it to be problematic.

@jonathan-eq
Contributor Author

> > > This deviates from the way OpenPBS driver works. Should we keep them separate or also change OpenPBS to use dataclasses?
> >
> > I think it would be best to do them separately to keep the PRs smaller. I have created a separate issue for the OpenPBS driver (#7938)
>
> It is not as obvious for OpenPBS as it is for LSF that it should change to dataclasses, since it actually uses JSON parsing code from Pydantic. So it might not be correct to port OpenPBS to dataclasses, and then the question remains whether LSF should be kept on Pydantic for consistency. The weight of Pydantic might not matter unless we have measured it to be problematic.

Yes, but is it not overkill to use Pydantic just to avoid implementing a from_dict class method in the OpenPBS driver? Would it be a bad idea to implement from_dict or from_json, and use the builtin json module instead?
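
For illustration, here is a minimal sketch of what a hand-written `from_dict` / `from_json` could look like using only the standard library. The class `JobStatus` and its fields are hypothetical and are not taken from the OpenPBS driver; the sketch only shows roughly how much code would replace the Pydantic parsing.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class JobStatus:  # hypothetical class, not from the ert code base
    job_id: str
    job_state: str

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "JobStatus":
        # Keep only the declared fields so extra keys in the input are ignored.
        known = {f.name for f in fields(cls)}
        kwargs = {key: value for key, value in data.items() if key in known}
        missing = known - kwargs.keys()
        if missing:
            raise ValueError(f"Missing fields: {sorted(missing)}")
        return cls(**kwargs)

    @classmethod
    def from_json(cls, text: str) -> "JobStatus":
        # json.loads comes from the builtin json module; no third-party parser.
        return cls.from_dict(json.loads(text))


status = JobStatus.from_json('{"job_id": "42", "job_state": "R", "queue": "normal"}')
```

Note that this sketch does not validate field types or allowed values, which is part of what Pydantic provides; any such checks would have to be written by hand.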

@berland
Contributor

berland commented May 21, 2024

> Yes, but is it not overkill to use Pydantic just to avoid implementing a from_dict class method in the OpenPBS driver? Would it be a bad idea to implement from_dict or from_json, and use the builtin json module instead?

It is a tradeoff we must make a decision on.

@jonathan-eq
Contributor Author

> > Yes, but is it not overkill to use Pydantic just to avoid implementing a from_dict class method in the OpenPBS driver? Would it be a bad idea to implement from_dict or from_json, and use the builtin json module instead?
>
> It is a tradeoff we must make a decision on.

I say we should merge this one, and I will get going on OpenPBS #7938 🚀

This commit refactors lsf driver from using pydantic classes to dataclasses, as pydantic was overkill for our usage.
jobs: Mapping[str, AnyJob]

def _create_job_class(job_dict: Mapping[str, str]) -> AnyJob:
    job_state = job_dict["job_state"]
    if job_state in get_type_hints(FinishedJobSuccess)["job_state"].__args__:
Contributor


This smells a little bit. Can we get away with just a dictionary from string job-state to the correct class?

Contributor Author


Yes, it looks way better but Mypy does not like that whatsoever.

src/ert/scheduler/lsf_driver.py:98: error: Argument 1 to "FinishedJobSuccess" has incompatible type "Literal['EXIT', 'DONE', 'PEND', 'RUN', 'ZOMBI', 'PDONE', 'SSUSP', 'USUSP', 'PSUSP', 'UNKWN']"; expected "Literal['DONE', 'PDONE']"  [arg-type]
src/ert/scheduler/lsf_driver.py:98: error: Argument 1 to "FinishedJobFailure" has incompatible type "Literal['EXIT', 'DONE', 'PEND', 'RUN', 'ZOMBI', 'PDONE', 'SSUSP', 'USUSP', 'PSUSP', 'UNKWN']"; expected "Literal['EXIT', 'ZOMBI']"  [arg-type]
src/ert/scheduler/lsf_driver.py:98: error: Argument 1 to "QueuedJob" has incompatible type "Literal['EXIT', 'DONE', 'PEND', 'RUN', 'ZOMBI', 'PDONE', 'SSUSP', 'USUSP', 'PSUSP', 'UNKWN']"; expected "Literal['PEND']"  [arg-type]
src/ert/scheduler/lsf_driver.py:98: error: Argument 1 to "RunningJob" has incompatible type "Literal['EXIT', 'DONE', 'PEND', 'RUN', 'ZOMBI', 'PDONE', 'SSUSP', 'USUSP', 'PSUSP', 'UNKWN']"; expected "Literal['RUN', 'SSUSP', 'USUSP', 'PSUSP']"  [arg-type]
src/ert/scheduler/lsf_driver.py:98: error: Argument 1 to "IgnoredJobstates" has incompatible type "Literal['EXIT', 'DONE', 'PEND', 'RUN', 'ZOMBI', 'PDONE', 'SSUSP', 'USUSP', 'PSUSP', 'UNKWN']"; expected "Literal['UNKWN']"  [arg-type]

Successfully merging this pull request may close these issues.

Refactor object from Pydantic to dataclass
3 participants