-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validation schema design #7
Comments
I started to look at PR #10, but I thought I would take a couple steps back and try to get my thoughts down on the overall schema for this database, not just the submission portion. Hopefully it can be a conversation starter, and we can do a few iterations on this, and it'll help us map out the remaining work and the order we want to take it on. I also took a stab at what the filing and submissions state machines might look like. Hope this is useful. Let's sync up on all this soon. Probably early next week. Also, this is my first time playing with the inline Mermaid diagrams. It's not my favorite style-wise, but it's hard to beat having it just work right in a GitHub comment. ✨ ---
title: Filing API Database Schema
---
erDiagram
filing_type {
varchar id PK
varchar name UK
}
filing_period {
varchar id PK
varchar name UK
date start_date
date end_date
date due_date
varchar filing_type_id FK
}
filing_period }|--|| filing_type : ""
filing_state {
varchar id PK
varchar name UK
}
filing {
integer id PK "auto-increment"
varchar lei "pseudo FK to `institution` table"
varchar filing_period_id PK,FK
varchar state FK
varchar institution_snaphot_id "This needs some thought"
}
filing }|--|| filing_period : ""
filing }|--|| filing_state : ""
submission_state {
varchar id PK
varchar name UK
}
submission {
integer id PK
integer filing_id FK
integer attempt "Q: Do we need this AND `time`?"
timestamp time
varchar user_id "Login.gov user id"
varchar state FK
varchar validation_ruleset_version "Useful if we update validation logic mid filing season"
jsonb findings "JSONB alternative to additional `*_findings` tables"
}
submission }|--|| filing : ""
submission }|--|| submission_state : ""
submission_finding {
integer submission_id FK
varchar validation_code "Q: Do we need additional validation metadata?"
}
submission_finding }|--|| submission : ""
submission_finding_record {
integer record_number
varchar field_name
varchar field_value
}
submission_finding_record }|--|| submission_finding : ""
---
title: Filing State Machine
---
stateDiagram
[*] --> filing_started
filing_started --> institution_data_set
institution_data_set --> submissions_in_progress
submissions_in_progress --> filing_complete
filing_complete --> submissions_in_progress
filing_complete --> must_refile
must_refile --> submissions_in_progress
filing_complete --> [*]
---
title: Submission State Machine
---
stateDiagram
classDef failure fill:#f00,color:white
classDef error fill:#ff0
[*] --> submission_started
submission_started --> uploading
uploading --> upload_failed:::failure
uploading --> uploaded
uploaded --> validating
validating --> system_failure:::failure
validating --> invalid_file:::error
validating --> validated_with_errors:::error
validating --> validated_with_warnings
validating --> validated_clean
validated_with_warnings --> signed
validated_clean --> signed
signed --> [*]
|
Wow, mermaid is cool. I used to just throw stuff together in Visio but wasn't sure how somewhat high level designs were done here so figured I'd let the code be the design. Like the mermaid thing! Read through comments and diagram. I'm putting my thoughts down here so I don't forget them later.
I'm scheduling a chat for Monday if that works. |
Maybe #6 should've been worked on first. That would open up the big picture of how the flows would look like; original thought of this ticket was only about storing the validation results. I think the filing state and things aside from the validation should be worked out there. I've linked the API flow from HMDA in that ticket: https://ffiec.cfpb.gov/documentation/api/filing/platform#endpoints, shares similarities with the state machine mentioned here aside from the confirm institution data part not being present on the HMDA end. |
---
title: Filing API Database Classes
---
classDiagram
class `FilingDAO(AuditMixin, Base):`{
__tablename__ = "filing"
id: Mapped[int] = mapped_column#40;primary_key=True, autoincrement=True#41;
lei: Mapped[str]
state: Mapped[FilingState] = mapped_column#40;Enum#40;FilingState#41;#41;
filing_period = Mapped[str] = mapped_column(ForeignKey#40;"filing_period.id"#41;#41;
institution_snapshot_id = Mapped[str] # not sure what this is
}
class `FilingPeriodDAO(AuditMixin, Base):`{
__tablename__ = "filing_period"
id: Mapped[int] = mapped_column#40;primary_key=True, autoincrement=True#41;
name: Mapped[str]
start: Mapped[datetime]
end: Mapped[datetime]
due: Mapped[datetime]
filing_type: Mapped[FilingType] = mapped_column#40;Enum#40;FilingType#41;#41;
}
class `SubmissionDAO(AuditMixin, Base):`{
__tablename__ = "submission"
id: Mapped[str] = mapped_column#40;index=True, primary_key=True#41;
submitter: Mapped[str] # user_id instead?
results: Mapped[List["FindingDAO"]] = relationship#40;back_populates="submission"#41;
state: Mapped[SubmissionState] = mapped_column#40;Enum#40;SubmissionState#41;#41;
validation_ruleset_version: Mapped[str]
json_dump: Mapped[dict[str, Any]] = mapped_column#40;JSON, nullable=True#41;
filing: Mapped[str] = mapped_column#40;ForeignKey#40;"filing.id"#41;#41;
}
class `FindingDAO(AuditMixin, Base):`{
__tablename__ = "submission_finding"
id: Mapped[int] = mapped_column#40;primary_key=True, autoincrement=True#41;
submission_id: Mapped[str] = mapped_column(ForeignKey#40;"submission.id"#41;#41;
submission: Mapped["SubmissionDAO"] = relationship#40;back_populates="results"#41; # if we care about bidirectional
validation_code: Mapped[str]
severity: Mapped[Severity] = mapped_column#40;Enum#40;*get_args#40;Severity#41;#41;#41;
records: Mapped[List["RecordDAO"]] = relationship#40;back_populates="result"#41;
}
note for `FindingDAO(AuditMixin, Base):` "Severity = Literal['error', 'warning']"
class `RecordDAO(AuditMixin, Base):`{
__tablename__ = "submission_finding_record"
id: Mapped[int] = mapped_column#40;primary_key=True, autoincrement=True#41;
result_id: Mapped[str] = mapped_column#40;ForeignKey#40;"submission_finding.id"#41;#41;
result: Mapped["FindingDAO"] = relationship#40;back_populates="records"#41; # if we care about bidirectional
record: Mapped[int]
field_name: Mapped[str]
field_value: Mapped[str]
}
`FilingPeriodDAO(AuditMixin, Base):` *-- `FilingDAO(AuditMixin, Base):`
`FilingDAO(AuditMixin, Base):` *-- `SubmissionDAO(AuditMixin, Base):`
`SubmissionDAO(AuditMixin, Base):` *-- `FindingDAO(AuditMixin, Base):`
`FindingDAO(AuditMixin, Base):` *-- `RecordDAO(AuditMixin, Base):`
---
title: State Enums
---
classDiagram
note for `SubmissionState(Enum):` "SUBMISSION_UPLOAD_FAILED seems to only be
applicable if the storage to the S3 fails.
Otherwise, if there is a failure to send the
file to the initial upload endpoint, the endpoint
wouldn't know about it. Does not being able to
store to S3 indicate an overall processing failure?"
class `SubmissionState(Enum):`{
SUBMISSION_UPLOADED = 1
SUBMISSION_UPLOAD_FAILED = 2
SUBMISSION_WITH_ERRORS = 3
SUBMISSION_WITH_WARNINGS = 4
SUBMISSION_SUCCESSFUL = 5
SUBMISSION_SIGNED = 6
}
class `FilingState(Enum):`{
FILING_STARTED = 1
FILING_IN_PROGRESS = 2
FILING_COMPLETE = 3
}
class `FilingType(Enum):`{
TYPE_A = "Type_A"
TYPE_B = "Type_B"
}
---
title: Filing State Machine
---
stateDiagram
[*] --> filing_started
filing_started --> filing_in_progress
filing_in_progress --> filing_in_progress
filing_in_progress --> filing_complete
filing_complete --> [*]
---
title: Submission State Machine
---
stateDiagram
classDef failure fill:#f00,color:white
classDef warning fill:#ff0
[*] --> submission_uploaded
submission_uploaded --> submission_upload_failed:::failure
submission_uploaded --> submission_with_errors:::failure
submission_uploaded --> submission_with_warnings:::warning
submission_uploaded --> submission_successful
submission_with_warnings --> submission_signed
submission_successful --> submission_signed
submission_signed --> [*]
|
…rom #7) Merged in branch 7-validation-schema-design-wip to support latest DAOs Updated alembic scripts for new tables Updated tests to test alembic migration, and also updated the repo test since I changed the FilingType enum to be MANUAL instead of TYPE_A and TYPE_B
Added pytests to test the update_submission fuction. Please note the comment on the _generator_function in test_submission_repo.py Needed to merge in parts from story #7
Closes #13 Created a new branch that focuses purely on the alembic script needs instead of a full merge of #7 and #18 into the #13 branch. Makes it so only the code related to the migrations is needed for review. Originally I merged them all together to test starting the service in poetry to make sure the tables are created correctly in Postgres but that made the PR have a lot of files that were already covered in other PRs. Still the case with the entities/models directory but couldn't get around that. ## Additions - Added db_revision folder using alembic init and updated files for SBL specifics (using user-fi as an example) - Added migration files using alembic revision for each submission table (submission, filing, filing_period) - Updated migration files for table specifics - Added pytest migrations tests for new tables, columns, and fks - Had to merge in the entities/models dir since the migrations use the enumerations ## Removals - ## Changes - ## Testing 1. Run pytest in poetry shell (or poetry run pytest) and verify the tests pass ## Screenshots ## Notes - ## Todos - ## Checklist - [ ] PR has an informative and human-readable title - [ ] Changes are limited to a single goal (no scope creep) - [ ] Code can be automatically merged (no conflicts) - [ ] Code follows the standards laid out in the [development playbook](https://github.com/cfpb/development) - [ ] Passes all existing automated tests - [ ] Any _change_ in functionality is tested - [ ] New functions are documented (with a description, list of inputs, and expected output) - [ ] Placeholder code is flagged / future todos are captured in comments - [ ] Visually tested in supported browsers and devices (see checklist below 👇) - [ ] Project documentation has been updated (including the "Unreleased" section of the CHANGELOG) - [ ] Reviewers requested with the [Reviewers tool](https://help.github.com/articles/requesting-a-pull-request-review/) :arrow_right: ## Testing checklist ### Browsers - [ ] Chrome - [ ] Firefox - [ ] Safari - [ ] Internet Explorer 8, 9, 10, and 11 - [ ] Edge - [ ] iOS Safari - [ ] Chrome for Android ### Accessibility - [ ] Keyboard friendly - [ ] Screen reader friendly ### Other - [ ] Is useable without CSS - [ ] Is useable without JS - [ ] Flexible from small to large screens - [ ] No linting errors or warnings - [ ] JavaScript tests are passing
Closing this as the models are merged into main under other stories. As the filing process matures we'll write new stories to capture those updates. |
Persist results from data-validator to DB
Schema design things to consider:
The text was updated successfully, but these errors were encountered: