Replies: 25 comments 6 replies
-
There are many ways to achieve this via pure SQL views.
-
Hi @tirumaraiselvan, thanks for the answer. I will explain my case in more detail.

```sql
CREATE VIEW appointments_by_category AS
SELECT category, COUNT(*) AS count
FROM appointments
GROUP BY category;
```

Calling it from the Hasura GraphQL endpoint:

```graphql
query MyQuery {
  appointments_by_category {
    category
    count
  }
}
```

But if I wanted to filter by date:

```graphql
query MyQuery {
  appointments_by_category(where: {date: {_gt: "2017-01-01"}}) {
    category
    count
  }
}
```

This will not work, since the field "date" is not available in the results. I could create the view like this:

```sql
CREATE VIEW appointments_by_category AS
SELECT category, COUNT(*) AS count
FROM appointments
WHERE date > '2017-01-01'
GROUP BY category;
```

But I want the filter to be dynamic. Obviously, I don't want to always filter by the same date.
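A dynamic version of this filter can be sketched as a table-returning SQL function that Hasura can track. This is only a sketch against the `appointments` table above; the function and return-table names here are made up for illustration:

```sql
-- Hypothetical return-shape table, tracked by Hasura so the function's
-- result gets a GraphQL type.
CREATE TABLE appointments_by_category_result (
  category text,
  count bigint
);

-- Hypothetical function: the date bound becomes a parameter,
-- so the WHERE runs before the GROUP BY.
CREATE FUNCTION appointments_by_category_after(after_date date)
RETURNS SETOF appointments_by_category_result AS $$
  SELECT category, COUNT(*) AS count
  FROM appointments
  WHERE date > after_date
  GROUP BY category;
$$ LANGUAGE sql STABLE;
```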
-
Will a view like the below work?
-
Nope :(
-
@hector Right. The only solution I can imagine with "no code" is:
And then run an auto-generated aggregate query on it:
-
Mmm, I see your point.
-
Any solution?
-
@joni7777 if you have the same problem that I do, you will need to implement a SQL function that does the grouping for you. Check these docs: https://hasura.io/docs/1.0/graphql/manual/schema/custom-functions.html#custom-sql-functions I can imagine Hasura might implement a solution for this, but I understand it is not trivial. For starters, the type of the data returned when grouping changes for every possible combination.
-
@hector Yeah, I saw that option and I did try it, but when you want to group the data and summarise it, AND filter by timestamp for a specific subset of the data, it's not possible, because the grouped rows don't have the timestamp. Example table:

user | views | timestamp

Results wanted for timestamp >= "2020-03-01 ...." AND <= "2020-03-02 ...."
-
You need to build a SQL function which accepts the parameters you need. In your case I think it would be the dates, right? Something like this:

```sql
CREATE FUNCTION users_grouped_function(start_datetime timestamptz, end_datetime timestamptz)
RETURNS SETOF users_grouped AS $$
  SELECT users.id, sum(users.views)
  FROM users
  WHERE start_datetime <= users.timestamp AND end_datetime >= users.timestamp
  GROUP BY users.id;
$$ LANGUAGE sql STABLE;
```

I wrote this on the fly, so there might be errors. An important thing to note is that the return type has to be a table. This is one of the constraints of Hasura (https://hasura.io/docs/1.0/graphql/manual/schema/custom-functions.html#supported-sql-functions).
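For the `RETURNS SETOF users_grouped` part to work, that return shape has to exist as a Hasura-tracked table (or view). A minimal sketch; the column types are assumptions, since the real `users` schema isn't shown in the thread:

```sql
-- Hypothetical return-shape table: it holds no data itself,
-- it only gives Hasura a tracked type for the function's result.
CREATE TABLE users_grouped (
  id text,      -- assumed type of users.id
  views bigint  -- SUM over an integer column yields bigint
);
```

After tracking both the table and the function, the generated GraphQL field accepts the function parameters via `args`, and the usual auto-generated filtering applies to the function's result set.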
-
Yeah, just saw it and used it now. It's a good enough solution, but I really wish there were group-by aggregations and joins with remote schemas 🙌🏻
-
Our app is heavy on the data visualization side. With this setup we can also still use all of Hasura's relationship logic. Does Hasura rock or what?? To find all the work log entries that employees did on a job for a particular customer, grouped by whatever, and easily graph the results without JS madness, we can do this:

```graphql
query {
  grouped_work_entry_aggregate(
    args: {
      where: {userId: {_in: ["id1", "id2"]}}
      groupByDay: false
      groupByUser: true
      groupByCustomer: false
    }
    order_by: {day: desc}
  ) {
    aggregate {
      count
    }
    nodes {
      duration
      avgDuration
      user { displayName }
      day
      customer {
        name
      }
    }
  }
}
```

SQL function:

```sql
DROP FUNCTION IF EXISTS public.grouped_work_entry;
CREATE FUNCTION public.grouped_work_entry(
  "where" json default null,
  "groupByDay" boolean default false,
  "groupByWeek" boolean default false,
  "groupByMonth" boolean default false,
  "groupByJob" boolean default false,
  "groupByUser" boolean default false,
  "groupByCustomer" boolean default false
)
-- Must return a Hasura-tracked table of matching shape/type
RETURNS SETOF work_entry_generic_group AS $$
  SELECT
    -- Optional grouping setup:
    -- sets a field to NULL if the grouping flag was not set
    CASE WHEN "groupByCustomer" THEN "job"."customerId" ELSE null END as "gCustomer",
    CASE WHEN "groupByDay" THEN date_trunc('day', "work_entry"."start") ELSE null END as "gDay",
    CASE WHEN "groupByWeek" THEN date_trunc('week', "work_entry"."start") ELSE null END as "gWeek",
    CASE WHEN "groupByMonth" THEN date_trunc('month', "work_entry"."start") ELSE null END as "gMonth",
    CASE WHEN "groupByUser" THEN "work_entry"."userId" ELSE null END as "gUser",
    CASE WHEN "groupByJob" THEN "work_entry"."jobId" ELSE null END as "gJob",
    -- Counts
    COUNT(DISTINCT "work_entry"."userId"),
    COUNT(DISTINCT "work_entry"."id"),
    COUNT(DISTINCT "work_entry"."jobId"),
    COUNT(DISTINCT "job"."customerId"),
    -- Work entry duration stats
    SUM("work_entry"."duration"),
    AVG("work_entry"."duration"),
    MIN("work_entry"."duration"),
    MAX("work_entry"."duration"),
    -- More work entry time stats
    MIN("work_entry"."start"),
    MAX("work_entry"."end"),
    MIN("work_entry"."startTime"),
    MAX("work_entry"."endTime")
  FROM "work_entry"
  INNER JOIN "job" ON "work_entry"."jobId" = "job"."id"
  WHERE
    -- Customer
    ("where"#>>'{job,customerId,_eq}' IS NULL OR ("job"."customerId" = "where"#>>'{job,customerId,_eq}'))
    AND ("where"#>>'{job,customerId,_in}' IS NULL OR ((("where"#>'{job,customerId,_in}')::jsonb) ? "job"."customerId"))
    -- User
    AND ("where"#>>'{userId,_eq}' IS NULL OR ("work_entry"."userId" = "where"#>>'{userId,_eq}'))
    AND ("where"#>>'{userId,_in}' IS NULL OR ((("where"#>'{userId,_in}')::jsonb) ? "work_entry"."userId"))
    -- Job
    AND ("where"#>>'{jobId,_eq}' IS NULL OR ("work_entry"."jobId" = "where"#>>'{jobId,_eq}'))
    AND ("where"#>>'{jobId,_in}' IS NULL OR ((("where"#>'{jobId,_in}')::jsonb) ? "work_entry"."jobId"))
    -- Times
    AND ("where"#>>'{start,_lte}' IS NULL OR ("work_entry"."start" <= ("where"#>>'{start,_lte}')::timestamp with time zone))
    AND ("where"#>>'{start,_gte}' IS NULL OR ("work_entry"."start" >= ("where"#>>'{start,_gte}')::timestamp with time zone))
  GROUP BY
    "gDay",
    "gWeek",
    "gMonth",
    "gJob",
    "gCustomer",
    "gUser"
$$ LANGUAGE sql STABLE;
```
-
@nolandg I like your solution and I made something very similar :)
-
@nolandg How "generic" do you think that custom function is? I'm wondering whether it would be possible to write a JS/TS script that uses introspection to auto-generate at least mostly functional copies of these group by functions per-table. Seems like it could be neat and fairly useful. You think it might map cleanly from table + relationship definition to SQL function templates?
-
@GavinRay97 The function I showed above is not generic at all, but yes, if we had more tables we needed that grouping on, we would definitely make some sort of code generator to produce the SQL. I think if you defined a limited set of 1st-level fields/values/aggregations you wanted, it would not be hard to map them cleanly. Making it completely generic and covering all cases would produce a huge function of questionable performance and utility, I fear. What might work is describing the shape of the aggregations and conditions you want and then making a parser to produce that function. You would also need to handle the Hasura side and make a generator to create (and run?) the migrations required to create the Hasura-tracked table that mirrors the function--the function output table type--with all the relations you want on that. Again, it is unfeasible to make it completely generic and cover all cases, especially when it comes to relationships. You'd need to describe the limits you want. In a perfect world, Hasura would do all this for us :-)
-
I would also love to have this feature built in. Adding functions to the database is not ideal, but is a workaround we will use for now.
-
Just to give another use case which would greatly benefit from server-side group by: charts. Charts usually require aggregations. Taking https://github.com/hasura/graphql2chartjs (https://www.youtube.com/watch?v=153iv1-qFuc) as an example, a bar chart shows how many albums an artist has. A view is used to count the number of albums per artist. If we go a bit beyond this simple example and imagine a more complex page, we would like to filter artists by time/genre/etc.
For extremely large data sets even filtering needs to be done on indexed fields, and responsive aggregate queries usually require pre-aggregation, which actually reduces the problem to medium datasets. Also, subscriptions with grouping might be postponed IMHO, if problematic to implement in the first iteration.
-
How does this work? You didn't ...
-
@revskill10 Not sure I understand what you mean. If you need to uniquely identify the returned rows, then you can generate an id for each row in the above SQL. If non-deterministic ids are OK for your application, then you can simply use a SQL UUID function. If you need deterministic ids for the grouped rows, then you could create a hash of your determining columns right in the SQL above. We eventually required deterministic ids because we started using subscriptions on this grouped table. If subscriptions don't return deterministic ids, it wreaks havoc with your Apollo cache and the UI is continually refreshing. This is what we are doing now for generating the id column:

```sql
CONCAT(
  min("work_entry"."id"::text), '_',
  min("work_entry"."userId"), '_',
  min("work_entry"."jobId"), '_',
  min("work_entry"."taskId"), '_',
  min("work_entry"."start"), '_',
  CASE WHEN "groupByDay" THEN date_trunc('day', "work_entry"."start")::date ELSE null END, '_',
  CASE WHEN "groupByWeek" THEN date_trunc('week', "work_entry"."start")::date ELSE null END, '_',
  CASE WHEN "groupByMonth" THEN date_trunc('month', "work_entry"."start")::date ELSE null END
),
```

It concatenates all the fields our application deems as determining the identity of a unique grouped row.
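If that concatenated key gets unwieldy, the same determining fields can be folded into a fixed-length id with Postgres's built-in `md5()`. This is an untested variant, not from the original comment, shown with a shortened field list:

```sql
-- Hypothetical: hash the concatenated determining fields so the
-- deterministic id stays fixed-length no matter how many
-- columns participate in identity.
md5(CONCAT(
  min("work_entry"."id"::text), '_',
  min("work_entry"."userId"), '_',
  CASE WHEN "groupByDay" THEN date_trunc('day', "work_entry"."start")::date ELSE null END
)) AS "id",
```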
-
As an aside, are there any Postgres date gurus following this? We're struggling and think we've hit an impossibility with timezones and grouping. Postgres complains that we can't use timezone conversion functions within an aggregate function. That would have been a great option, because we know the timezone info for each entry and we do want to be able to adjust reports for different timezones. So... we don't see many options. Maybe not saving as ...
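One pattern that usually sidesteps this restriction: do the timezone conversion in the grouping expression rather than inside an aggregate. A sketch under assumed column names, not from the thread, with a hard-coded zone where a function parameter would go:

```sql
-- Hypothetical: bucket by local day in one target timezone.
-- AT TIME ZONE converts the timestamptz column to a local timestamp
-- first; date_trunc then buckets it, outside of any aggregate call.
SELECT
  date_trunc('day', "work_entry"."start" AT TIME ZONE 'America/Vancouver') AS "gDay",
  SUM("work_entry"."duration") AS "totalDuration"
FROM "work_entry"
GROUP BY 1;
```

In a Hasura custom function, the zone string would come in as a text parameter so each report can pick its own timezone.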
-
@nolandg Thanks for the explanation.
If we set all the group by conditions to true, there should be a count/max/min for each row. How do we achieve that?
-
Hey folks, I just wanted to share an elegant solution to this problem with current Hasura features. Let's take the OP's schema from the original comment: suppose there is a table called `appointments`, and we want to group by `category`. To do this, we can define a new table which has the unique values of the group-by column(s); in this case, a table called, say, `categories`. We can then create a foreign key relationship from `appointments` to `categories`. Now, we can query like below:

We start querying from the "group by table" and add a filter condition on the aggregate of the actual table. Tada!
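The query elided above would look something like this. This is a hedged reconstruction, assuming the group-by table is named `categories` with a `name` column and an array relationship `appointments` back to the OP's table:

```graphql
query AppointmentsByCategory {
  categories {
    name
    appointments_aggregate(where: {date: {_gt: "2017-01-01"}}) {
      aggregate {
        count
      }
    }
  }
}
```

Because the filter lives on the nested aggregate, the WHERE is applied to the raw rows before counting, which is exactly what the view-based approach couldn't do.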
-
Is group by ever gonna be supported by Hasura?
-
While we don't have plans to support group by in the auto-generated schema in the near term, there is a feature in the works that addresses this. It's not out of beta yet, but you can find out more information in the recording of the Hasura Community Call for March 2023, and contact us directly to try it out.
-
Please refer to the RFC for GroupBy that has been posted here #9712. We welcome your feedback as we prepare to start work on adding GroupBy to Hasura's auto-generated GraphQL schema. |
-
Hi,
Thanks for your cool project.
I need to do some grouping in a query. The docs suggest using a view (https://docs.hasura.io/1.0/graphql/manual/queries/derived-data.html).
My problem is that I want to filter the rows before they are grouped (using a WHERE).
For example, imagine you have a table of appointments. I want to filter them based on their date (e.g. events after November) and group them by their category.
If I use a view, the grouping happens first and I cannot access the date column afterwards; thus, I cannot filter.
For the moment, I solved this by writing my own function, but then I lose much of the power and simplicity that comes with Hasura (for example, having to implement the filters myself).
Did I miss something? I cannot stop thinking there must be a more elegant solution I missed.