Support histograms #283

Closed
maelp opened this issue Jan 31, 2024 · 10 comments
Labels
documentation Improvements or additions to documentation

Comments

@maelp

maelp commented Jan 31, 2024

It's not clear from the documentation how to create histogram plots. Is it possible out of the box, or do we need to do our own binning and then use a bar chart?

I think it's one of the most widespread graph types, so it might be useful to have a few examples in the documentation.

@jheer added the documentation label Jan 31, 2024
@jheer
Member

jheer commented Jan 31, 2024

Here is an existing example (among others) consisting of three linked histograms:
https://uwdata.github.io/mosaic/examples/flights-200k.html

And yes, the current model is to perform binning and create a bar chart.

Agreed that the documentation might benefit from a simpler example with "Histogram" in the name!
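
To make the "bin, then draw bars" model concrete, here is a minimal sketch in plain JavaScript. It does not use the vgplot API: `binCounts` and the sample values are hypothetical, and the code only illustrates how raw values become `[x1, x2)` intervals with counts that a bar or rect mark could draw.

```javascript
// Hypothetical helper (not part of Mosaic or vgplot): group numeric values
// into fixed-width bins. Each bin covers [x1, x2) and carries a count.
function binCounts(values, step) {
  const counts = new Map();
  for (const v of values) {
    // Lower edge of the bin that contains v.
    const x1 = Math.floor(v / step) * step;
    counts.set(x1, (counts.get(x1) ?? 0) + 1);
  }
  // Sort bins by lower edge and attach the upper edge.
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([x1, count]) => ({ x1, x2: x1 + step, count }));
}

// Sample data (made up for illustration).
const bins = binCounts([1, 4, 7, 12, 13, 18, 25], 10);
console.log(bins);
```

Each resulting object maps directly onto one bar: `x1` and `x2` give the horizontal extent and `count` gives the height.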

@maelp
Author

maelp commented Jan 31, 2024

Thanks!

@maelp closed this as completed Jan 31, 2024
@maelp reopened this Jan 31, 2024
@domoritz
Member

I think it would also help to make the distinction between Mosaic and vgplot clearer. The introduction does a very good job of explaining what Mosaic is and how it works, but maybe we can be more explicit about the fact that vgplot is not the only way to use Mosaic.

@maelp
Author

maelp commented Jan 31, 2024 via email

@domoritz
Member

Can you elaborate? We already explain extensibility in https://uwdata.github.io/mosaic/why-mosaic/#mosaic-is-extensible and have docs for building clients at https://uwdata.github.io/mosaic/core/#clients.

@Unemyr

Unemyr commented Feb 5, 2024

Another example that would be really fantastic would show how to bin timestamp data. There are some really good examples of how to plot data by day of week or month of year, but I just can't get a linear time scale to plot.

vgplot.bin() throws the error "Binder Error: No function matches the given name and argument types '-(TIMESTAMP, BIGINT)' You might need to add explicit type casts." when you try to do that, for example with the rectY mark. Perhaps there is something really rudimentary I am missing to get that to work?

I am able to get an areaY to render with timestamp data, but I believe that for performance reasons it would be significantly better to have the data binned first.
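
The error message suggests that the binning arithmetic subtracts an integer from a TIMESTAMP value, which the database rejects. One way to see what fixed-width binning of timestamps involves is the plain JavaScript sketch below. It is not vgplot or SQL, and `binTimestamps` is a hypothetical name: the idea is to convert each timestamp to epoch milliseconds (a plain number), bin numerically, and convert the bin edges back to dates. Whether an equivalent cast is the right fix inside vgplot is an assumption, not something confirmed in this thread.

```javascript
// Hypothetical helper (not part of Mosaic or vgplot): bin Date values into
// fixed-width intervals by working on epoch milliseconds instead of dates.
function binTimestamps(dates, stepMs) {
  const counts = new Map();
  for (const d of dates) {
    const ms = d.getTime(); // timestamp -> plain number
    const x1 = Math.floor(ms / stepMs) * stepMs; // lower bin edge in ms
    counts.set(x1, (counts.get(x1) ?? 0) + 1);
  }
  // Convert numeric bin edges back to Date objects for plotting.
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([x1, count]) => ({
      x1: new Date(x1),
      x2: new Date(x1 + stepMs),
      count,
    }));
}

// Sample data (made up): three events binned into one-hour intervals.
const HOUR_MS = 60 * 60 * 1000;
const hourly = binTimestamps(
  [
    new Date("2024-02-05T00:10:00Z"),
    new Date("2024-02-05T00:50:00Z"),
    new Date("2024-02-05T01:20:00Z"),
  ],
  HOUR_MS
);
console.log(hourly);
```

This only works for intervals of constant length such as seconds or hours; calendar units like months vary in length and need a different approach.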

@Unemyr

Unemyr commented Feb 6, 2024

An update on the histogram with a timescale dimension - I was able to make that work, using the following approach:

  1. Created a customized dateYearMonthDay() function, based on the existing dateMonth() API. It needs an input argument stating whether the X1 or X2 parameter should be calculated (X2 will add +1 to the year date).

  2. Modified the rectY function to take x1 and x2 inputs, referencing the above function.

However, I am not sure this would be considered best practice. Should the bin() function be able to support this seamlessly, as with other data types (or, if it doesn't today, is the aspiration that it should once implemented)? Feel free to comment on any better approach to achieve this. I do think an example would be very useful, as many people also use timescale elements for bar charts (business reports, etc.).
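
The two steps above can be sketched in plain JavaScript as follows. This is not the actual dateYearMonthDay() code, which is not shown in the thread: `dayEdge` and `binByDay` are hypothetical names, the sketch assumes UTC calendar days, and it assumes the x2 edge is the start of the following day.

```javascript
// Hypothetical helper: return the start (x1) or, when `end` is true, the end
// (x2) of the UTC calendar day containing `date`.
function dayEdge(date, end = false) {
  const d = new Date(
    Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate())
  );
  // Assumption: x2 is the start of the next day; setUTCDate handles
  // month and year rollover.
  if (end) d.setUTCDate(d.getUTCDate() + 1);
  return d;
}

// Count events per day and emit { x1, x2, count } records for a rect mark.
function binByDay(dates) {
  const counts = new Map();
  for (const t of dates) {
    const key = dayEdge(t).getTime();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([ms, count]) => ({
      x1: new Date(ms),
      x2: dayEdge(new Date(ms), true),
      count,
    }));
}

// Sample data (made up for illustration).
const daily = binByDay([
  new Date("2024-02-05T08:00:00Z"),
  new Date("2024-02-05T23:30:00Z"),
  new Date("2024-02-06T01:00:00Z"),
]);
console.log(daily);
```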

@jheer
Member

jheer commented Feb 6, 2024

Hi @Unemyr, this is the direction I would recommend. The vgplot bin transform is specifically focused on binning quantitative values, and by design it does not operate on date-time data and related intervals (year, quarter, month, etc). I'd recommend opening a new feature request issue for support for time bin functions that produce the desired intervals (not unlike what Vega-Lite provides). We'd also be happy to review PRs along these lines.
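
As a rough illustration of what such time bin functions would compute, the plain JavaScript sketch below floors a date to the start of its year, quarter, or month and returns the start of the following interval. `timeInterval` is a hypothetical name, not an existing or planned Mosaic function, and the sketch assumes UTC.

```javascript
// Hypothetical sketch: return [start, end) of the UTC year, quarter, or month
// that contains `date`. Calendar units vary in length, so the edges are built
// from calendar fields rather than from fixed-width arithmetic.
function timeInterval(date, unit) {
  const y = date.getUTCFullYear();
  const m = date.getUTCMonth();
  switch (unit) {
    case "year":
      return [new Date(Date.UTC(y, 0, 1)), new Date(Date.UTC(y + 1, 0, 1))];
    case "quarter": {
      const q = Math.floor(m / 3) * 3; // first month of the quarter
      return [new Date(Date.UTC(y, q, 1)), new Date(Date.UTC(y, q + 3, 1))];
    }
    case "month":
      // Date.UTC rolls month 12 over to January of the next year.
      return [new Date(Date.UTC(y, m, 1)), new Date(Date.UTC(y, m + 1, 1))];
    default:
      throw new Error(`unsupported unit: ${unit}`);
  }
}

const [start, end] = timeInterval(new Date("2024-02-06T12:00:00Z"), "quarter");
console.log(start.toISOString(), end.toISOString());
```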

@Unemyr

Unemyr commented Feb 6, 2024

OK noted on that. I would be open to contributing PRs for that later. Thanks for the quick reply!

@domoritz
Member

I'll close this for now since Mosaic supports histograms.

4 participants