Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order deep proposal queries by another column #383

Merged
merged 1 commit into from
Jun 1, 2023
Merged

Conversation

bickelj
Copy link
Contributor

@bickelj bickelj commented May 31, 2023

Without this change, a non-deterministic sort order can cause an HTTP 500 response with message "...values and fields must be sorted...".

With this change, the sort order is first by position (as before) and then also by proposal field value (pfv) ID because there can be many pfv rows for a given application form field (aff) row. This should be sufficient to guarantee deterministic order such that the application join will succeed.

Issue #381 Unexpected 500 response: "values and fields must be sorted"

See also #198 and #210.

@codecov
Copy link

codecov bot commented May 31, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (b0a09a5) 95.24% compared to head (5ddf751) 95.24%.

❗ Current head 5ddf751 differs from pull request most recent head ae90b97. Consider uploading reports for the commit ae90b97 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #383   +/-   ##
=======================================
  Coverage   95.24%   95.24%           
=======================================
  Files          66       66           
  Lines         968      968           
  Branches      161      161           
=======================================
  Hits          922      922           
  Misses         43       43           
  Partials        3        3           

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor

@jasonaowen jasonaowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these both contain inner joins with proposal_field_values, and its id column is a unique surrogate key, I think this should stably sort everything. 👍 on the untested belief that this fixes an immediate bug.

However, doing application joins like this is bad. This is what a database is for; we shouldn't be doing this in code. It's also not the first time we've had this conversation, although probably the first time it's bitten us like this.

As a side note, it is somewhat strange to me to be ordering the results of a query by columns that are not returned in that query!

Thank you for tracking this down, @bickelj, and coming up with a quick fix!

@jasonaowen
Copy link
Contributor

However, doing application joins like this is bad. This is what a database is for; we shouldn't be doing this in code. It's also not the first time we've had this conversation, although probably the first time it's bitten us like this.

See also #314!

@@ -10,4 +10,4 @@ INNER JOIN proposal_field_values pfv
INNER JOIN proposal_versions pv
ON pv.id = pfv.proposal_version_id
WHERE pv.proposal_id = :proposalId
ORDER BY pv.version DESC, pfv.position;
ORDER BY pv.version DESC, pfv.position, pfv.id;
Copy link
Member

@slifty slifty Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First: I agree with you all that doing application joins isn't the right approach in the long run (and we have a plan to move from it)

Second: So long as we are enriching data, a cleaner solution, IMHO, is to implement the enrichment algorithm in a way that does not require artificially coupled queries -- e.g. one that will take data and, if it needs sorting, just sort it rather than throwing an error. Alternatively, having this context-specific sorting occur as an array manipulation applied to the query result but before the result is passed, rather than in the query.

From a design PoV I think that is more maintainable than this approach, as this now has two unrelated queries that need to be sorted in a specific way for reasons that don't actually make sense in the context of the queries themselves.

I assume the desire to not sort in code is an optimization decision, but we don't currently have reason to believe that the optimization is necessary. On the flip side, the optimization has resulted in this bug and if we merge the PR I would argue it is resulting in tech debt in the form of un-intuitive coupling.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…if we merge the PR I would argue it is resulting in tech debt in the form of un-intuitive coupling.

If this design is so wrong that we should prioritize a wider fix, that's reasonable, but I don't think it should block pushing out this production hotfix.

Not to be demo-driven, but we do have a big demo tomorrow and likely a bunch of stakeholders logging in right after, and their experience with a broken site is net-worse for the project than this relatively benign change, which is at least idiomatic to our approach even if we're operating under the wrong idiom.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait wait, I have a better reply:

PULL REQUESTS WELCOME

:trollface:

Copy link
Member

@slifty slifty Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well to be clear I'm not proposing a wider fix, just that instead of sorting in the query we should just invoke .sort((a,b) => ...) in the immediate context of where a particular context-specific order matters.

which is at least idiomatic to our approach even if we're operating under the wrong idiom.

I'm suggesting this is not idiomatic to our approach, and that the correct solution within our current approach is to sort the data via typescript or bypass the need for sorted data in our join.

PULL REQUESTS WELCOME

I know this is a joke, and not to serious face reply but I don't think we should get in the habit of reviewers needing to take on the burden of owning an issue simply because they have feedback!

It's fine to decide to merge tech debt and open an issue to capture that tech debt if it seems like it'd be too much effort to make the change as part of this PR!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I wanna triple down on the last part though: I intentionally left this as a comment rather than "request changes" in the name of not blocking progress!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agree that reviewers have to be free to suggest improvements without a requisite burden of providing them, or else they'll hold their feedback and that's bad! My joke was truly a non sequitur and I apologize for the insinuation. While it's obviously not up to me, diving into a PR that provides the fix wouldn't be how I would prioritize your time, that's how much of a joke it was.

Re: the approach, I'll bow out and let you guys discuss it further, but by idiomatic here I meant "we're already sorting in the query; fix that sort to not blow up" and by non-idiomatic I mean "changing where we sort".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So clearly "idiomatic" is the wrong word here. "Minimally different from current design but not broken" perhaps?

Copy link
Member

@slifty slifty Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I was thinking of the current sorting in the query as being use-case-agnostic (e.g. selecting something by proposal ID can happen for all sorts of reasons, we picked a way to sort that makes generic sense). Whereas this new sort is very specific to JUST this use case of selecting in order to join proposal field values.

That said, I now get what you're saying, which I think is essentially "even if we don't want to rely on the query to sort, were already doing that, and we might as well update the arbitrary sort order to fix the situation even if we shouldn't be relying on arbitrary sort order"

So, FINE JUSTIN YOU ARE RIGHT (even if I think the most correct fix to overly coupled pieces of code is to decouple them gosh darn it!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW … ID … JUST … FINE JUSTIN YOU ARE RIGHT

My speed-reading strategy of skimming for all-caps words continues to pay off!

Without this change, a non-deterministic sort order can cause an HTTP
500 response with message "...values and fields must be sorted...".

With this change, the sort order is first by position (as before) and
then also by proposal field value (pfv) ID because there can be many
pfv rows for a given application form field (aff) row. This should be
sufficient to guarantee deterministic order such that the application
join will succeed.

Issue #381 Unexpected 500 response: "values and fields must be sorted"

See also #198
and #210.
@bickelj
Copy link
Contributor Author

bickelj commented Jun 1, 2023

ae90b97 rebases on main.

@bickelj
Copy link
Contributor Author

bickelj commented Jun 1, 2023

The approach here (meaning in the original PR that created the ORDER BY) was a compromise between "let the database do the data things" and "let the application do the data things", so the "let the database sort so the application can do an O(1) join" is on the former tack while the "do a join in the application at all" is on the latter tack. Yeah, it's not great, but as pointed out above, this is what it is right now and we have yet to come up with a completely satisfactory-to-everybody solution.

@bickelj bickelj merged commit 6af28f7 into main Jun 1, 2023
2 checks passed
@bickelj bickelj deleted the sort-order-matters branch June 1, 2023 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants