Improved polymorphic and STI index suggestions #109

Epigene · 2019-06-20T09:51:34Z

Hi!

I am using v2.1.8 and polymorphic indexes are still suggested despite them already being present in different order, see #107

Furthermore, I believe the suggestion should be in the format [:poly_type, :poly_id]. This is somewhat of a controversial topic, but the reasoning for putting the type first is that it is more useful:

Pros:

Index can be used for searches only by type (in case we want a count of how many records by types there are)

Cons:

Index can not be used to search by just :poly_id, but this should be a non-problem since it can be argued that searching by just polymorphic ID does not guarantee that the correct record is selected.

Additionally, there are false positives when using Rails' STI pattern. If a table has a :type column loldba proposes an [:id, :type] index, which is redundant, especially if an index on just :type is present (and :id is indexed by default).

The text was updated successfully, but these errors were encountered:

plentz · 2020-03-29T03:51:44Z

Sorry for taking so long to reply @Epigene. I just merged #107 which fixes the order problem.

About the ordering of the index itself, can you provide some references for it? I'm pretty sure that modern databases can use only to search by type even if it's on the second position of the index.

Epigene · 2020-07-29T14:57:16Z

Sorry for taking so long to get back to you on this topic, @plentz. I believe the order of index is as relevant as ever and will probably never be an issue of DB engine modernity.

MySQL docs state:

The name index is an index over the last_name and first_name columns. The index can be used for lookups in queries that specify values in a known range for combinations of last_name and first_name values. It can also be used for queries that specify just a last_name value because that column is a leftmost prefix of the index (as described later in this section). Therefore, the name index is used for lookups in the following queries[...]

Similarly, PSQL docs state:

A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns. The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will be used to limit the portion of the index that is scanned. Constraints on columns to the right of these columns are checked in the index, so they save visits to the table proper, but they do not reduce the portion of the index that has to be scanned. For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5. Index entries with c >= 77 would be skipped, but they'd still have to be scanned through. This index could in principle be used for queries that have constraints on b and/or c with no constraint on a — but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.

This is easy to see using the [:last_name, :first_name] index example and these four records, in order:

Doe, John
Doe, Ziggy
Example, Fred
Example, John

A query WHERE first_name = 'John' can not (efficiently) use the index because to get all Johns the DB engine must read through all last names anyway.

Epigene · 2020-07-29T15:05:53Z

To be clear, I'm arguing that an index of order [:poly_type, :poly_id] is better than [:poly_id, :poly_type] because it supports an efficient WHERE poly_type = 'MyModel', which is a common use-case, whereas [:poly_id, :poly_type] index supports an efficient WHERE poly_id = '1', which is unlikely to be a use-case.

Both indexes fare equally well when queries have both terms like WHERE poly_type = 'MyModel' AND poly_id = '1'.

plentz self-assigned this Jul 1, 2019

plentz added the waiting feedback label Mar 29, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improved polymorphic and STI index suggestions #109

Improved polymorphic and STI index suggestions #109

Epigene commented Jun 20, 2019 •

edited

plentz commented Mar 29, 2020

Epigene commented Jul 29, 2020 •

edited

Epigene commented Jul 29, 2020

Improved polymorphic and STI index suggestions #109

Improved polymorphic and STI index suggestions #109

Comments

Epigene commented Jun 20, 2019 • edited

plentz commented Mar 29, 2020

Epigene commented Jul 29, 2020 • edited

Epigene commented Jul 29, 2020

Epigene commented Jun 20, 2019 •

edited

Epigene commented Jul 29, 2020 •

edited