Fix for slow query on MS SQL #22522
If I understand it correctly, this completely removes the check which ensures that is_primary_key and is_unique are only set if index_column_count and index_priority are either 1 or NULL. Can we really drop that?
My best guess for the slow query is that using a function and a comparison that does not establish a relation between the joined tables inside the ON expression of a join somehow hits a slow path in Azure. Maybe moving the check into the subquery would help.
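To make the "move the check into the subquery" idea concrete, here is a minimal sketch of the index subquery with the filter applied before the join. It is untested against a large schema, so whether it actually avoids the slow path is an assumption. The alias [ranked] is made up for illustration. The ISNULL wrappers are dropped because both window functions always return a value for rows that exist inside the subquery.

```sql
-- Sketch: keep only single-column indexes and the highest-priority index
-- per column, filtered inside the derived table rather than in the ON clause.
-- [ranked] is a hypothetical alias, not a name used in the PR.
SELECT [ranked].[object_id],
       [ranked].[column_id],
       [ranked].[is_unique],
       [ranked].[is_primary_key]
FROM (
    SELECT [ic].[object_id],
           [ic].[column_id],
           [ix].[is_unique],
           [ix].[is_primary_key],
           MAX([ic].[index_column_id])
               OVER (PARTITION BY [ic].[index_id], [ic].[object_id]) AS [index_column_count],
           ROW_NUMBER() OVER (
               PARTITION BY [ic].[object_id], [ic].[column_id]
               ORDER BY [ix].[is_primary_key] DESC, [ix].[is_unique] DESC) AS [index_priority]
    FROM [sys].[index_columns] [ic]
    JOIN [sys].[indexes] AS [ix]
        ON [ix].[object_id] = [ic].[object_id]
       AND [ix].[index_id] = [ic].[index_id]
) AS [ranked]
-- Window functions cannot appear in WHERE at the level where they are
-- computed, hence the extra nesting.
WHERE [ranked].[index_column_count] = 1
  AND [ranked].[index_priority] = 1;
```

Used as the [i] derived table in the introspection query, the LEFT JOIN's ON clause would then only need the two equality conditions on object_id and column_id, which should return the same rows as the current query.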
@nickrum, is this something the Directus team will investigate further, or would you like me to look into it more? I noticed the issue has been moved to 'ready,' so I'm curious about the status.
I'm currently looking into it. I cannot reproduce the issue, but only because I don't have a big and complex enough schema in my MS SQL database. We're currently thinking about how we can spin up an arbitrary schema for introspection performance tests, but right now that's tricky and time-consuming. I can confirm, though, that the query in this PR does not return the same result as the current one: it returns more records. I now have to check that this does not introduce any other problem. However, the fact that the PR does not cause any additional integration tests to fail is a good sign that we can use the updated query :)
I compared the query results further and noticed that the |
as described above
@jaads How did you verify this? Is there a test setup I can use to further evaluate the query's outcome? I'm happy to help resolve this issue. It remains a significant problem in our Azure environment, especially when editing our data model, as we are doing this month for our production website. Each data model edit currently takes 30-60 seconds, which is quite disruptive.
I enabled logging, started Directus on the main branch, copied the introspection query from the logs into my SQL editor (deleting the schema condition at the very end of the query), and ran it. Then I applied the changes from this PR to the query, ran it again, and noticed the differences mentioned above. However, when I repeated the same steps just now, the output seemed to be the same 🫨 So I'm doing something wrong either now or last time. Here are the queries whose results I compared:
select [o].[name] AS [table],
[c].[name] AS [name],
[t].[name] AS [data_type],
[c].[max_length] AS [max_length],
[c].[precision] AS [numeric_precision],
[c].[scale] AS [numeric_scale],
CASE
WHEN [c].[is_nullable] = 0 THEN
'NO'
ELSE
'YES'
END AS [is_nullable],
object_definition([c].[default_object_id]) AS [default_value],
[i].[is_primary_key],
[i].[is_unique],
CASE [c].[is_identity]
WHEN 1 THEN
'YES'
ELSE
'NO'
END AS [has_auto_increment],
OBJECT_NAME([fk].[referenced_object_id]) AS [foreign_key_table],
COL_NAME([fk].[referenced_object_id],
[fk].[referenced_column_id]) AS [foreign_key_column],
[cc].[is_computed] as [is_generated],
[cc].[definition] as [generation_expression]
from [master].[sys].[columns] [c]
JOIN [sys].[types] [t] ON [c].[user_type_id] = [t].[user_type_id]
JOIN [sys].[tables] [o] ON [o].[object_id] = [c].[object_id]
JOIN [sys].[schemas] [s] ON [s].[schema_id] = [o].[schema_id]
LEFT JOIN [sys].[computed_columns] AS [cc]
ON [cc].[object_id] = [c].[object_id] AND [cc].[column_id] = [c].[column_id]
LEFT JOIN [sys].[foreign_key_columns] AS [fk]
ON [fk].[parent_object_id] = [c].[object_id] AND [fk].[parent_column_id] = [c].[column_id]
LEFT JOIN (SELECT [ic].[object_id],
[ic].[column_id],
[ix].[is_unique],
[ix].[is_primary_key],
MAX([ic].[index_column_id])
OVER (partition by [ic].[index_id], [ic].[object_id]) AS index_column_count,
ROW_NUMBER() OVER (
PARTITION BY [ic].[object_id], [ic].[column_id]
ORDER BY [ix].[is_primary_key] DESC, [ix].[is_unique] DESC) AS index_priority
FROM [sys].[index_columns] [ic]
JOIN [sys].[indexes] AS [ix] ON [ix].[object_id] = [ic].[object_id]
AND [ix].[index_id] = [ic].[index_id]) AS [i] ON [i].[object_id] = [c].[object_id]
AND [i].[column_id] = [c].[column_id]
AND ISNULL([i].[index_column_count], 1) = 1
AND ISNULL([i].[index_priority], 1) = 1;
and the updated query from this PR:
select [o].[name] AS [table],
[c].[name] AS [name],
[t].[name] AS [data_type],
[c].[max_length] AS [max_length],
[c].[precision] AS [numeric_precision],
[c].[scale] AS [numeric_scale],
CASE
WHEN [c].[is_nullable] = 0 THEN
'NO'
ELSE
'YES'
END AS [is_nullable],
object_definition([c].[default_object_id]) AS [default_value],
[i].[is_primary_key],
[i].[is_unique],
CASE [c].[is_identity]
WHEN 1 THEN
'YES'
ELSE
'NO'
END AS [has_auto_increment],
OBJECT_NAME([fk].[referenced_object_id]) AS [foreign_key_table],
COL_NAME([fk].[referenced_object_id],
[fk].[referenced_column_id]) AS [foreign_key_column],
[cc].[is_computed] as [is_generated],
[cc].[definition] as [generation_expression]
from [master].[sys].[columns] [c]
JOIN [sys].[types] [t] ON [c].[user_type_id] = [t].[user_type_id]
JOIN [sys].[tables] [o] ON [o].[object_id] = [c].[object_id]
JOIN [sys].[schemas] [s] ON [s].[schema_id] = [o].[schema_id]
LEFT JOIN [sys].[computed_columns] AS [cc]
ON [cc].[object_id] = [c].[object_id] AND [cc].[column_id] = [c].[column_id]
LEFT JOIN [sys].[foreign_key_columns] AS [fk]
ON [fk].[parent_object_id] = [c].[object_id] AND [fk].[parent_column_id] = [c].[column_id]
LEFT JOIN (SELECT [ic].[object_id],
[ic].[column_id],
[ix].[is_unique],
[ix].[is_primary_key],
COALESCE(MAX([ic].[index_column_id])
OVER (partition by [ic].[index_id], [ic].[object_id]), 1) AS index_column_count,
COALESCE(ROW_NUMBER() OVER (
PARTITION BY [ic].[object_id], [ic].[column_id]
ORDER BY [ix].[is_primary_key] DESC, [ix].[is_unique] DESC), 1) AS index_priority
FROM [sys].[index_columns] [ic]
JOIN [sys].[indexes] AS [ix] ON [ix].[object_id] = [ic].[object_id]
AND [ix].[index_id] = [ic].[index_id]) AS [i] ON [i].[object_id] = [c].[object_id]
AND [i].[column_id] = [c].[column_id];
Are the results the same for you? @boring-joey 🤔
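One way to make this comparison less manual might be to diff the two variants directly with EXCEPT. The sketch below is an assumption about how that could look, not something taken from the PR. It compares only the index part of the two queries (with and without the index_column_count / index_priority check), and the CTE names [indexed], [current_rows] and [pr_rows] are made up for illustration.

```sql
-- Sketch: list index rows that only one of the two variants would join in.
WITH [indexed] AS (
    SELECT [ic].[object_id],
           [ic].[column_id],
           [ix].[is_unique],
           [ix].[is_primary_key],
           MAX([ic].[index_column_id])
               OVER (PARTITION BY [ic].[index_id], [ic].[object_id]) AS [index_column_count],
           ROW_NUMBER() OVER (
               PARTITION BY [ic].[object_id], [ic].[column_id]
               ORDER BY [ix].[is_primary_key] DESC, [ix].[is_unique] DESC) AS [index_priority]
    FROM [sys].[index_columns] [ic]
    JOIN [sys].[indexes] AS [ix]
        ON [ix].[object_id] = [ic].[object_id]
       AND [ix].[index_id] = [ic].[index_id]
),
-- Current behaviour: the check on count and priority is applied.
[current_rows] AS (
    SELECT [object_id], [column_id], [is_unique], [is_primary_key]
    FROM [indexed]
    WHERE [index_column_count] = 1 AND [index_priority] = 1
),
-- PR behaviour: every index row for the column is joined.
[pr_rows] AS (
    SELECT [object_id], [column_id], [is_unique], [is_primary_key]
    FROM [indexed]
)
SELECT 'only in PR query' AS [side],
       OBJECT_NAME([d].[object_id]) AS [table],
       COL_NAME([d].[object_id], [d].[column_id]) AS [column],
       [d].[is_unique],
       [d].[is_primary_key]
FROM (SELECT * FROM [pr_rows] EXCEPT SELECT * FROM [current_rows]) AS [d]
UNION ALL
SELECT 'only in current query',
       OBJECT_NAME([d].[object_id]),
       COL_NAME([d].[object_id], [d].[column_id]),
       [d].[is_unique],
       [d].[is_primary_key]
FROM (SELECT * FROM [current_rows] EXCEPT SELECT * FROM [pr_rows]) AS [d];
```

An empty result would suggest the two variants agree on the schema at hand. EXCEPT works on distinct rows, so duplicated rows would not show up here; comparing COUNT(*) of both full queries as well would cover that case.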
What's changed:
Potential Risks / Drawbacks
Review Notes / Questions
Fixes #19486
You can review the related issue here.
#19486