Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not use jsonb_set in postgreSQL #1814

Merged
merged 3 commits into from Mar 7, 2024

Conversation

TBonnin
Copy link
Collaborator

@TBonnin TBonnin commented Mar 7, 2024

Describe your changes

Appending _nango_metadata to the record json in postgres tends to be slow when json is big
This commit remove the call to jsonb_set in the query and manipulate the json in js land to add _nango_metadata child attributes

Issue ticket number and link

https://linear.app/nango/issue/NAN-515/fetching-big-records-is-slow

Checklist before requesting a review (skip if just adding/editing APIs & templates)

  • I added tests, otherwise the reason is:
  • I added observability, otherwise the reason is:
  • I added analytics, otherwise the reason is:

Copy link

linear bot commented Mar 7, 2024

last_action: LastAction;
deleted_at: Date | null;
deleted_at: string | null;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those were never date since they are coming from pg as json encoded date.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should update the types in the node-client too to match these: https://github.com/NangoHQ/nango/blob/master/packages/node-client/lib/index.ts#L83

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix in 7c90610

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knex is doing some transformation so I guess it was kinda true :D

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not even. knex was fine with those fields defined as date but at runtime typeof would be string 😡

cursor: string;
}

// DEPRECATED
export interface DataRecordWithMetadata extends RecordMetadata {
record: object;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only use in the deprecated records function so it should be removed when we finally delete the deprecated record fetching method

cursor: expect.stringMatching(/^[A-Za-z0-9+/]+={0,2}$/) // base64 encoded string
});
expect(response?.next_cursor).toBe(null); // no next page
});
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a test to make sure the format of metadata isn't modified with those changes

const rawResults: RawDataRecordResult[] = await query.select(
// PostgreSQL stores timestamp with microseconds precision
// however, javascript date only supports milliseconds precision
// we therefore convert timestamp to string (using to_json()) in order to avoid precision loss
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically we cannot query pg timestamp column into a js Date without losing the microseconds precision (which then make the cursor logic wrong) so all dates must be json encoded string


return result[0] as unknown as SyncDataRecord;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated but not used anywhere

Copy link
Member

@khaliqgant khaliqgant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and thanks for adding tests. I have a follow up to add more tests that will hopefully assist when you do the records rework. Feedback inline about the node-client

@TBonnin TBonnin force-pushed the tbonnin/nan-515/no-json-manipulation-in-pg branch 2 times, most recently from fc1b772 to 174df9c Compare March 7, 2024 08:17
@TBonnin
Copy link
Collaborator Author

TBonnin commented Mar 7, 2024

@khaliqgant can I have a second look. I fix the type in the node client and did some code cleanup

Copy link
Member

@khaliqgant khaliqgant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Appending _nango_metadata to the record json in postgres tends to be
slow when json is big
This commit remove the call to jsonb_set in the query and manipulate the
json in js land to add _nango_metadata child attributes
@TBonnin TBonnin force-pushed the tbonnin/nan-515/no-json-manipulation-in-pg branch from 174df9c to 03dfb45 Compare March 7, 2024 09:03
@TBonnin TBonnin enabled auto-merge (squash) March 7, 2024 09:03
Copy link
Contributor

@bodinsamuel bodinsamuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 🚀
Thanks for doing that, did you noticed something to make you jump on it?

Comment on lines +265 to 268
// DEPRECATED
export interface DataRecordWithMetadata extends RecordMetadata {
record: object;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// DEPRECATED
export interface DataRecordWithMetadata extends RecordMetadata {
record: object;
}
/**
* @deprecated
*/
export interface DataRecordWithMetadata extends RecordMetadata {
record: object;
}

if (!connection) {
throw new Error(`Connection '${nangoConnectionId}' not found`);
}
const chunkSize = 1000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personal opinion: code without line break is a bit harder to read in the long run

expect(success).toBe(true);
expect(error).toBe(null);
expect(response?.records.length).toBe(n);
const timestampRegex = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d{6}\+\d{2}:\d{2}$/;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had created an helper in vitest to achieve something similar without copy pasting

const dateReg = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}.\d{3}Z$/;

expect.extend({
  toBeIsoDate: (received: any) => {
    if (!dateReg.test(received)) {
      return {
        message: () => `expected ${received} to be an ISO Date`,
        pass: false,
      };
    }
    return { pass: true, message: () => '' };
  },
});

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extended expect at first as well but since it was actually shorter to use stringMatching I went for that

connection.connection_id,
connection.provider_config_key,
connection.environment_id,
model,
undefined, // delta
limit, // limit
undefined, // filter
cursor // cursor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

json params would make this better :D

@TBonnin TBonnin merged commit bf28260 into master Mar 7, 2024
16 checks passed
@TBonnin TBonnin deleted the tbonnin/nan-515/no-json-manipulation-in-pg branch March 7, 2024 09:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants