Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New database model for Task Queue entries, to reduce network traffic #881

Closed
wants to merge 5 commits into from

Conversation

aaronchongth
Copy link
Member

@aaronchongth aaronchongth commented Jan 19, 2024

What's new

  • New TaskQueueEntry model to be used for Task Queue, instead of TaskState (which contains all the phases), to reduce network traffic
  • Removed warn time feature for UX reasons
  • Generated new api-client
  • Frontend changes for this

Self-checks

  • I have prototyped this new feature (if necessary) on Figma
  • I'm familiar with and follow this Typescript guideline
  • I added unit-tests for new components
  • I tried testing edge cases
  • I tested the behavior of the components that interact with the backend, with an e2e test

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>
Signed-off-by: Aaron Chong <aaronchongth@gmail.com>
Signed-off-by: Aaron Chong <aaronchongth@gmail.com>
Copy link

codecov bot commented Jan 19, 2024

Codecov Report

Attention: 1115 lines in your changes are missing coverage. Please review.

Comparison is base (fe0e808) 49.35% compared to head (78fa1fc) 44.97%.
Report is 52 commits behind head on deploy/hammer.

Files Patch % Lines
packages/dashboard/src/components/map-app.tsx 0.00% 171 Missing ⚠️
.../dashboard/src/components/delivery-alert-store.tsx 0.00% 100 Missing ⚠️
...ackages/react-components/lib/tasks/create-task.tsx 0.00% 84 Missing ⚠️
...kages/dashboard/src/components/tasks/tasks-app.tsx 0.00% 78 Missing ⚠️
...es/api-server/api_server/models/delivery_alerts.py 12.94% 74 Missing ⚠️
packages/react-components/lib/tasks/utils.ts 4.76% 54 Missing and 6 partials ⚠️
...ckages/api-server/api_server/repositories/tasks.py 36.26% 58 Missing ⚠️
...ckages/api-server/api_server/routes/tasks/tasks.py 24.32% 56 Missing ⚠️
packages/dashboard/src/components/door-summary.tsx 0.00% 51 Missing ⚠️
...shboard/src/components/tasks/task-cancellation.tsx 0.00% 48 Missing ⚠️
... and 37 more
Additional details and impacted files
@@                Coverage Diff                @@
##           deploy/hammer     #881      +/-   ##
=================================================
- Coverage          49.35%   44.97%   -4.38%     
=================================================
  Files                285      291       +6     
  Lines               7564     8514     +950     
  Branches            1050     1323     +273     
=================================================
+ Hits                3733     3829      +96     
- Misses              3682     4485     +803     
- Partials             149      200      +51     
Flag Coverage Δ
api-server 73.64% <36.82%> (-7.16%) ⬇️
dashboard 13.84% <3.53%> (-1.23%) ⬇️
react-components 44.14% <15.95%> (-3.94%) ⬇️
rmf-auth ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@koonpeng koonpeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain how TaskQueueEntry improve perf? It seems to be the same as TaskState.

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>
@aaronchongth
Copy link
Member Author

Yeah certainly, this hopes to resolve 2 things,

  • TaskState carries a lot of nested information, TaskState->phases->events->detail. Basically carrying all the information and timestamps that we don't require on the frontend. I still need to retrieve logs and review, but it appears that for problematic tasks that require human-intervention, retrieval of 10 states (pagination) can take up to 3 or 4 seconds, as opposed to 70ms when retrieving 10 tasks that executed well.
  • overall we still require server-side sorting and filtering, which means those fields need to be explicitly defined, as opposed to be in a JSONField. For maintainability (and not to interfere with automatically generated models from rmf_api_msgs), I thought it best to explicitly define a model for this table, the frontend will retrieve task states (with all information) as needed

let me know what you think, or if there are better ways to achieve this

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>
@koonpeng
Copy link
Collaborator

koonpeng commented Jan 22, 2024

  • TaskState carries a lot of nested information, TaskState->phases->events->detail. Basically carrying all the information and timestamps that we don't require on the frontend. I still need to retrieve logs and review, but it appears that for problematic tasks that require human-intervention, retrieval of 10 states (pagination) can take up to 3 or 4 seconds, as opposed to 70ms when retrieving 10 tasks that executed well.

TaskState don't seem to contain events, phases etc, I only see them in task logs. Unless RMF is including all the data in the raw message, but even so, we still need to save the data. We can fetch only the data that we need, avoiding this json. Another thing we can do is to change it from JsonField to TextField so that both postgres and tortoise does not try to process it. But moving the json to another table may also help, depending on the database internals, it may improve indexing performance and data locality.

I think what we can do are:

  • Keep TaskState as is, just improve the way we fetch data.
  • Change data to TextField, since we are changing schema already, might as well also move it to another table, but don't rename TaskState, instead create a new table called RawTaskState or something.
  • Keep TaskState as in, but do not populate the data field anymore. Instead create a new RawTaskState table and populate the json there instead.

Moving data to another table also applies to all other RMF data, so we if do option 2 might be a good idea to do it at once to minimize migration pains.

@aaronchongth
Copy link
Member Author

aaronchongth commented Jan 22, 2024

Per VC, what we know now,

  • the task phases coming in are accurate, just that it increases without limit over time as robots get stuck

what we should do,

other TODOs afterwards,

  • use a separate endpoint for alert acknowledgement, instead of appending to phases
  • perhaps move pickup and destination, to request, use new api's to query those (but not sure how to sort or filter across multiple database models for now). Maybe the request for the task should just be embedded in TaskState, so there is a single source of truth

@aaronchongth
Copy link
Member Author

Changing to draft, as this involves more discussions

@aaronchongth aaronchongth marked this pull request as draft January 22, 2024 15:11
@aaronchongth aaronchongth deleted the hammer/fix-bandwidth branch May 27, 2024 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants