Support `DISTINCT` post-lookup aggregates #1183

ethan-readyset · 2024-03-27T19:53:53Z

Description

This change removed support for post-lookup aggregates that use `DISTINCT` (e.g. `COUNT(DISTINCT …)`) because our implementation was incorrect.

Consider the following example:

CREATE TABLE t (x int, y int);
INSERT INTO t (x, y) VALUES (1, 1), (2, 1);

which gives us the table

x | y
-----
1 | 1
2 | 1

The query `SELECT COUNT(DISTINCT y) FROM t WHERE x > 0` should return 1, since there is only one distinct value for y across x = 1 and x = 2; however, Readyset returns 2. The graph for this query looks something like this:

Base --> Distinct[y over values of x] --> Count[x] --> Reader

The distinct node contains one row for each value of x, and the count node contains a count of 1 for each of these values of x. What is not reflected in the count node is that the counts for each value of x actually include overlapping values of y (i.e. y = 1 is reflected across both values of x). When the reader node is queried for x > 0, it sums the counts across all the values of x in that range, which means we end up double-counting y = 1.
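The double-counting can be reproduced with a small model of the graph. This is only an illustrative sketch in Python, not Readyset code: the dictionaries stand in for node state, and the names `distinct_by_x`, `count_by_x`, `post_lookup_count`, and `expected_count` are made up for the example.

```python
from collections import defaultdict

rows = [(1, 1), (2, 1)]  # (x, y) rows of table t

# Modelled distinct node: the set of distinct y values seen for each x.
distinct_by_x = defaultdict(set)
for x, y in rows:
    distinct_by_x[x].add(y)

# Modelled count node: one count of distinct y values per x.
count_by_x = {x: len(ys) for x, ys in distinct_by_x.items()}


def post_lookup_count(lower_bound):
    """Sum the per-key counts for every x > lower_bound (the range lookup)."""
    return sum(c for x, c in count_by_x.items() if x > lower_bound)


def expected_count(lower_bound):
    """COUNT(DISTINCT y) computed directly over the matching rows."""
    return len({y for x, y in rows if x > lower_bound})


print(post_lookup_count(0))  # 2: y = 1 is counted once per value of x
print(expected_count(0))     # 1: the answer the query should return
```

Summing per-key counts is only correct when the sets of y values behind each key are disjoint, which the count node has no way of knowing.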

We could probably resolve this by compiling queries with distinct aggregates and range keys to look something like this:

Base --> Distinct --> Reader

and then computing the count at read time. That would allow us to de-duplicate rows across multiple keys.
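A rough sketch of that idea, again as a Python model rather than Readyset code: the reader is assumed to hold the distinct y values per key, and the count is taken over the union of those sets when a range is read. The names `reader` and `count_distinct_at_read` are hypothetical.

```python
rows = [(1, 1), (2, 1)]  # (x, y) rows of table t

# Modelled reader state: distinct y values stored under each key x,
# with no count node in between.
reader = {}
for x, y in rows:
    reader.setdefault(x, set()).add(y)


def count_distinct_at_read(lower_bound):
    """Union the y values of every key x > lower_bound, then count them."""
    seen = set()
    for x, ys in reader.items():
        if x > lower_bound:
            seen |= ys
    return len(seen)


print(count_distinct_at_read(0))  # 1
```

The trade-off in this model is that the reader stores every distinct row instead of a single count per key, and each read does work proportional to the number of rows in the range.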

We should also investigate other potential strategies.

Change in user-visible behavior

Yes

Requires documentation change

Yes
