Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

173 multithread submission processing #174

Merged
merged 26 commits into from May 2, 2024

Conversation

jcadam14
Copy link
Contributor

@jcadam14 jcadam14 commented Apr 23, 2024

Closes #173
Closes #181

  • Added the multithread_handler.py to handle creating the event loop, and monitoring the future from the process executor
  • Added the ProcessPoolExecutor stuff to the submission POST. I was not able to get this working in its own class/function for some reason, I had to add this stuff for the executor and background task directly to the endpoint. Otherwise, the execution task never fired.
  • Added pytests

@jcadam14 jcadam14 self-assigned this Apr 23, 2024
@jcadam14 jcadam14 linked an issue Apr 23, 2024 that may be closed by this pull request
Copy link

github-actions bot commented Apr 23, 2024

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  src/sbl_filing_api
  config.py
  src/sbl_filing_api/entities/engine
  engine.py
  src/sbl_filing_api/entities/repos
  submission_repo.py 107-110
  src/sbl_filing_api/routers
  filing.py
  src/sbl_filing_api/services
  multithread_handler.py 18-19
  submission_processor.py 127-135
Project Total  

This report was generated by python-coverage-comment-action

async def validate_and_update_submission(period_code: str, lei: str, submission: SubmissionDAO, content: bytes):
async def validate_and_update_submission(
period_code: str, lei: str, submission: SubmissionDAO, content: bytes, exec_check: dict
):
validator_version = imeta.version("regtech-data-validator")
submission.validation_ruleset_version = validator_version
submission.state = SubmissionState.VALIDATION_IN_PROGRESS
submission = await update_submission(submission)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why isn't this one in the with context block?

Copy link
Contributor Author

@jcadam14 jcadam14 Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I was lazy? Updated. Removed the concept of the update function and the get_submission function from having a None session. Calling code will now be expected to handle that piece.

Updated the pytests now for all that, too

@@ -109,6 +110,13 @@ async def update_submission(submission: SubmissionDAO, incoming_session: AsyncSe
return await upsert_helper(session, submission, SubmissionDAO)


async def expire_submission(submission_id: int):
session = SessionLocal()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we use the with context here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, updated.

validator_version = imeta.version("regtech-data-validator")
submission.validation_ruleset_version = validator_version
submission.state = SubmissionState.VALIDATION_IN_PROGRESS
submission = await update_submission(submission)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need to pass the session into the argument?

@jcadam14
Copy link
Contributor Author

jcadam14 commented Apr 30, 2024

Man my brain isn't in it this week apparently. Instead of trying to be fancy, I decided to make the update_submission and get_submission param order just like every other function in the submission_repo, which is first the session, then the things you're getting/updating. I then changed all the places using these calls to just pass in the objects, again like every other call does. Updated all the pytests to follow the same idea.

Three other updates:

  • First, I noticed the use of the old validation_monitor function in a pytest and that function was still in the submission_processor even though it was never used. Wiped it all out.
  • Second, I updated poetry.lock (with poetry lock) to pull in the latest data-validator. I noticed that it didn't have my performance updates. It seems poetry.lock takes a snapshot of the latest commit on main, but then holds that reference id. It wasn't until I ran poetry lock -> poetry install that it pulled in the changes. It seems with the Dockerfile that because we're just doing 'poetry install', if the poetry.lock file isn't updated when git dependencies have new commits to main, the poetry install will use the older commit head. So we just need to make sure to poetry lock occasionally I guess? Unless I'm completely crazy here.
  • Finally (not session pun intended), I changed the ValueError that the ContactInfoDTO was throwing on failed regex validation to a RequestValidationError so that the exception handlers would catch it and return in the standard way we're expecting now. Not sure if this was correct way, but that was raising that exception up and causing it to not get to the client the way we expect exception cases to.

@lchen-2101
Copy link
Collaborator

Man my brain isn't in it this week apparently. Instead of trying to be fancy, I decided to make the update_submission and get_submission param order just like every other function in the submission_repo, which is first the session, then the things you're getting/updating. I then changed all the places using these calls to just pass in the objects, again like every other call does. Updated all the pytests to follow the same idea.

Three other updates:

  • First, I noticed the use of the old validation_monitor function in a pytest and that function was still in the submission_processor even though it was never used. Wiped it all out.
  • Second, I updated poetry.lock (with poetry lock) to pull in the latest data-validator. I noticed that it didn't have my performance updates. It seems poetry.lock takes a snapshot of the latest commit on main, but then holds that reference id. It wasn't until I ran poetry lock -> poetry install that it pulled in the changes. It seems with the Dockerfile that because we're just doing 'poetry install', if the poetry.lock file isn't updated when git dependencies have new commits to main, the poetry install will use the older commit head. So we just need to make sure to poetry lock occasionally I guess? Unless I'm completely crazy here.
  • Finally (not session pun intended), I changed the ValueError that the ContactInfoDTO was throwing on failed regex validation to a RequestValidationError so that the exception handlers would catch it and return in the standard way we're expecting now. Not sure if this was correct way, but that was raising that exception up and causing it to not get to the client the way we expect exception cases to.

for the 2nd bullet point, I've always just done poetry update {dependency name}, and it pulls in the latest from the repo (it actually is smart enough to see that repo has had no changes and won't update if so), and updates the lock file for me.

for the 3rd bullet point; keep it ValueError, I just tried it, and it gets thrown correctly as the RequestValidationError, which then gets correctly handled by the request validation error handler.

Copy link
Collaborator

@lchen-2101 lchen-2101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lchen-2101 lchen-2101 merged commit 0ddf51b into main May 2, 2024
3 checks passed
@lchen-2101 lchen-2101 deleted the 173-multithread-submission-processing branch May 2, 2024 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[WIP] Close session after operation Multithread submission processing
2 participants