New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor track_deletes #1892
Refactor track_deletes #1892
Conversation
await configService.createProviderConfig({ | ||
unique_key: DEMO_GITHUB_CONFIG_KEY, | ||
provider: 'github', | ||
environment_id: env!.id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated to track_deletes: making environment id not optional, same from sync and syncJob below
@@ -1,14 +1,12 @@ | |||
export interface DataResponse { | |||
id?: string; | |||
[index: string]: unknown | undefined | string | number | boolean | Record<string, string | boolean | number | unknown>; | |||
[index: string]: null | undefined | object | string | number | boolean | Record<string, string | boolean | number>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated but the linter was yelling at me
} | ||
|
||
export interface UpsertSummary { | ||
addedKeys: string[]; | ||
updatedKeys: string[]; | ||
deletedKeys?: string[]; | ||
affectedInternalIds: string[]; | ||
affectedExternalIds: string[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not used anywhere
@@ -401,25 +401,17 @@ export default class SyncRun { | |||
async finishSync(models: string[], syncStartDate: Date, version: string, totalRunTime: number, trackDeletes?: boolean): Promise<void> { | |||
let i = 0; | |||
for (const model of models) { | |||
let deletedKeys: string[] = []; | |||
if (!this.isWebhook && trackDeletes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@khaliqgant I have kept the condition here but I am not sure why !this.isWebhook
is necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because a webhook and a sync with track deletes share the same config
37034c1
to
7fcf79b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works well, minor comments 👌🏻
I have one test that I think failed, but I don't know enough the system to be sure it's not what we want.
Run a sync three times:
-
- Add record id: 1 ✅
-
- Delete record id: 1 ✅
-
- Add back record id: 1 ❌
The record columnexternal_deleted_at
is still filled with a date
- Add back record id: 1 ❌
I guess the next steps after a few weeks is to drop the table and pending_delete
column?
}) | ||
.whereNot({ | ||
sync_job_id: generation | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering about the performance of this query. There is no index currently mapping those conditions, do you have a way to run an explain in prod with a large sync?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed in Slack, it seems alright but If not, an index like this nango_connection_id, model, sync_id, job_id where external_is_deleted = false
could work
packages/shared/lib/services/sync/run.service.integration.test.ts
Outdated
Show resolved
Hide resolved
Good catch. will fix. Question though: should the creation date of a deleted records that is being brought back be reset or not. ie: should it be considered as added again or simply updated? |
The table will be dropped after the migration and the new table is not going to have a pending_delete column |
ac3033f
to
643b245
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, very nice improvement 🙏🏻
END IF; | ||
RETURN NEW; | ||
END; | ||
$$ LANGUAGE plpgsql; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no way to have this logic in the code? I'm really not fond of sql logic, it's hard to debug and easy to forget
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also not a huge fan of the fact it is easy to forget. The alternative would be to fetch all existing records and compare before merging
643b245
to
953152f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works as expected from my tests, thanks!
953152f
to
f2e2ae7
Compare
Describe your changes
Refactoring track_deletes logic so we don't need an extra snapshot tables.
The logic is now simple:
upsert
adds/updates records (or setdeletedAt
), including updating the sync_job_id that serves as a generationIdtrack_deletes == true
then every records without adeletedAt
that doesn't have the samesync_job_id
as the current one is marked as deletedIssue ticket number and link
https://linear.app/nango/issue/NAN-599/track-deletes-refactoring
Checklist before requesting a review (skip if just adding/editing APIs & templates)