Window functions do not work in query with aggregate #259

frtennis1 · 2024-01-01T20:05:00Z

This is half bug half feature request. The following snippet does not currently work,

x = np.random.normal(scale=50, size=100)
y = x + np.random.normal(scale=5, size=len(x))
df = pd.DataFrame({'x_val': x, 'y': y})

MosaicWidget(
    {
        "plot": [
            {
                "mark": "ruleX",
                "y1": {"agg": "avg(y) - stddev(y)/sqrt(count(y))"},
                "y2": {"agg": "avg(y) + stddev(y)/sqrt(count(y))"},
                "x": {"expr": "ntile(10) over (order by x_val)"},
                "marker": "tick",
                "data": {"from": "df"},
            }
        ]
    },
    data={'df': df},
)

That's because the generated sql is problematic,

> SELECT avg(y) - stddev(y)/sqrt(count(y)) AS "y1", avg(y) + stddev(y)/sqrt(count(y)) AS "y2", ntile(10) over (order by x_val) AS "x" FROM "df" AS "source" GROUP BY "x"
Binder Error: GROUP BY clause cannot contain window functions!

It'd be pretty nice if the above chart snippet could work because the intent seems unambiguous to me.

A suggestion for making it work would be to modify the query generation to move the non-aggregate clause computation into a "WITH" statement where there are no restrictions on window functions, i.e.

WITH "__mosaicTemp" AS (SELECT *, ntile(10) over (order by x_val) AS "x" FROM "df") SELECT avg(y) - stddev(y)/sqrt(count(y)) AS "y1", avg(y) + stddev(y)/sqrt(count(y)) AS "y2", "x" FROM "__mosaicTemp" GROUP BY "x"

which I think should not be a perf cost in cases where it's not needed because duckdb will optimize it away (?).

The text was updated successfully, but these errors were encountered:

frtennis1 linked a pull request Jan 1, 2024 that will close this issue

feat: allow window functions in groupby by adding with statement #260

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Window functions do not work in query with aggregate #259

Window functions do not work in query with aggregate #259

frtennis1 commented Jan 1, 2024

Window functions do not work in query with aggregate #259

Window functions do not work in query with aggregate #259

Comments

frtennis1 commented Jan 1, 2024