Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Performance Issue for filtering on varchar columns in MS_SQL database #2052

Merged
merged 22 commits into from
Mar 28, 2024

Conversation

abhishekkumams
Copy link
Contributor

@abhishekkumams abhishekkumams commented Feb 20, 2024

Why make this change?

  • Closes MSSQL API slow with large tables due to incorrect varchar/nvarchar in sp_executesql #1973
    • Slow response of DAB when filtering on varchar columns in MS_SQL compared to other DBs.
    • Same query in mssql taking around ~1sec, compared to <50ms in other DBs for large number of rows, eg. 20 million
    • This was happening because the query generated by DAB when runs on Sql Server for varchar columns, it automatically assumes the filter value of type nvarchar, due to which it does implicit cast leading to slowness.
    • To fix this we needed to identify if the filter is for varchar column, we explitily specify it to be varchar for faster response.

What is this change?

  • We are modifying the DbCommand and adding extra information that is sent to SqlClient.

  • We introduced a new property called SqlDbType, By Default for String DbType, SqlDbType was considered as nvarchar. now we are updating this property to the correct sql type.

  • This allows db to directly execute the command without type casting due to type mismatch.

  • Carved out the code which creates DbCommand in QueryExecuter, and added an override method in MsSqlQueryExecuter. here we set the property SqlDbType for the Parameters.

    • Link to Discussion which talks about this highlighting the issue : SWA hosted DAB API response time (performance) #1964 (comment)
    • Added a new property SqlDbType in Column class, currently we used to have only DotNet DbType which doesn't differenciate between varchar and Nvarchar. So, adding this SqlDbType gave us the freedom to modify query based on the sql db type.

How was this tested?

  • Since, this is a performance issue and only a new column got introduced. The main aim here was to make sure excising functionality doesn't break.
  • Added few more extra tests.

The performance difference is more visible when there are large number of rows. So, I added 20 million records and ran perf test.
Below are the performance results with avg response time:

varchar(before) nvarchar(before
GraphQL ~734ms ~23ms
REST ~655ms ~24ms
varchar(after) nvarchar(after)
GraphQL ~18ms ~23ms
REST ~18ms ~25ms

BEFORE CHANGE

GRAPHQL

VARCHAR

image
image

NVARCHAR

image
image

REST

VARCHAR

image

NVARCHAR

image

AFTER CHANGE

GRAPHQL

VARCHAR

image
image

NVARCHAR

image
image

REST

VARCHAR

image

NVARCHAR

image

@abhishekkumams
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@seantleonard
Copy link
Contributor

Based on your screenshots, it seems like "varchar" variant had the perf boost:

varchar(before) nvarchar(before
GraphQL 734ms 19ms
REST 655ms 24ms
varchar(after) nvarchar(after)
GraphQL 18ms 23ms
REST 18ms 25ms

Question about:

CAST(@param1 AS VARCHAR(MAX))

why cast instead of setting the sqldbtype on the dbconnectionparameter that is created?

@abhishekkumams
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@abhishekkumams
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@abhishekkumams
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Copy link
Contributor

@seantleonard seantleonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last question about tests.

src/Core/Models/GraphQLFilterParsers.cs Outdated Show resolved Hide resolved
src/Core/Models/GraphQLFilterParsers.cs Outdated Show resolved Hide resolved
src/Core/Resolvers/MsSqlQueryExecutor.cs Show resolved Hide resolved
@abhishekkumams
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@abhishekkumams
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@seantleonard
Copy link
Contributor

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Copy link
Contributor

@aaronburtle aaronburtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, the performance increases here are awesome! Just had a couple clarifying questions, a suggestion for some potential comment.

@abhishekkumams abhishekkumams merged commit 577a3f1 into main Mar 28, 2024
9 checks passed
@abhishekkumams abhishekkumams deleted the dev/abhishekkuma/fix_dab_mssql_perf_issue branch March 28, 2024 00:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Let's make this better mssql an issue thats specific to mssql perf
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MSSQL API slow with large tables due to incorrect varchar/nvarchar in sp_executesql
5 participants