Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combine action tables #4459

Closed
wants to merge 75 commits into from

Conversation

dullbananas
Copy link
Collaborator

@dullbananas dullbananas commented Feb 16, 2024

This should cause a huge improvement in query plans, especially for queries that previously reached the from/join collapse limits. For example, getting saved posts might now start with an index scan of the post_actions table, which avoids scanning posts that the user didn't do anything with (or all non-saved posts if I add partial indexes, but I don't know if I should do that).

This will also make the code much cleaner and reduce the size of the database. (Edit: it may or may not reduce size)

Indexes for the new action tables will use INCLUDE WHERE with IS NULL for each action column to keep index-only scans possible.

In the new joins, person_id will not use a bind parameter if it's None, so there can still be separate generic query plans for users that are not logged in.

todo: constraint names, indexes, auto delete rows with no actions

@dessalines
Copy link
Member

Before you go forward and spend too much time on this, it needs a lot of discussion, because we could lose a lot of data integrity solely for the sake of post_view query speed. An update to a person_action table, when that action could be many different columns is a lot more confusing than single-action tables with solid constraints.

There are a lot of inside-postgres things we could do before getting rid of the post_like or comment_like table (unfortunately most of them would be some form of caching / non-source data store tho).

@dullbananas
Copy link
Collaborator Author

dullbananas commented Feb 16, 2024

@dessalines Would that problem be fixed by using a composite type for each action that stores multiple values?

Edit: or multi-column constraints, like (a IS NULL) = (b IS NULL)

This was referenced Mar 25, 2024
@dullbananas
Copy link
Collaborator Author

That would also be okay to do as a scheduled_tasks.rs job, and probably a lot easier, and less taxing on postgres.

That wouldn't be better. It's not easy do to that in a way that doesn't involve checking all rows each time, and a trigger can avoid re-reading rows that don't need to be deleted.

@dullbananas
Copy link
Collaborator Author

dullbananas commented Apr 6, 2024

I will try to implement auto deletion using statements like this instead of triggers:

WITH
    update_result AS (UPDATE post_actions SET like_score = NULL, liked = NULL RETURNING *)
DELETE FROM post_actions
USING (
    -- I tried to put these selected values in a tuple with a name that can be referenced in the `WHERE` clause, but the name throws a parse error in the `WHERE` clause
    SELECT post_actions.post_id, post_actions.person_id, NULL, NULL, NULL, NULL, NULL, NULL, NULL
    FROM update_result AS post_actions
    -- This check only affects performance, and it can't replace the `(post_actions.*) = (rows_with_nulls.*)` check below because this one is done before locking the rows
    WHERE (post_actions.*) = (post_actions.post_id, post_actions.person_id, NULL, NULL, NULL, NULL, NULL, NULL, NULL)
) AS rows_with_nulls
WHERE (post_actions.*) = (rows_with_nulls.*);

This statement runs successfully but I didn't test it with any rows inserted.

Edit: this exact statement won't work because UPDATE can't affect what data is read until the current statement is finished.

@Nutomic
Copy link
Member

Nutomic commented May 17, 2024

Outdated, feel free to reopen when you work on it again.

@Nutomic Nutomic closed this May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants