Discussion on data exploration functionalities across HoloViz #326

Open
maximlt opened this issue Sep 12, 2022 · 1 comment

Comments

@maximlt
Member

maximlt commented Sep 12, 2022

In this hvPlot issue Marc asked for a UI in the hvPlotExplorer to sort data. I thought that this made sense; after all, the explorer is meant for data exploration, and sorting data can be useful in that context. However, I also thought that this opens the door to adding more transforms to the explorer, and soon users will ask for filtering capabilities. In the end, the explorer would re-implement what is already implemented in Lumen.

We recently discussed the features of HoloViz, and that there's a kind of separation between Panel and the HV tools, where Panel is used to create web apps and the HV tools to create plots. There's a natural intersection between these two worlds: data exploration dashboards. This kind of dashboard is defined by a pipeline that starts with one or more data sources, maybe alters them internally, and offers users ways to explore the data with indicators, tables and plots (often linked), and with widgets to transform the data even more (e.g. filters). People have built many dashboards of this kind, for instance to track COVID.

HoloViz users can already create and share this sort of web app; everything they need to do so is in the ecosystem. However, I believe it's not as straightforward as it could be, as they have to gather the different bits they need from different sources (Panel/Param, and hvPlot or HoloViews/GeoViews or a combination of those). The best resource is actually the Tutorial hosted on holoviz.org, but I think it's not easily discoverable and it's certainly quite long (its length is great for PyCon, probably not that great for the average internet visitor). I also believe HoloViz is one of the best options out there to produce this kind of dashboard.

HoloViz users are indeed offered many ways to create data exploration "pipelines" by writing Python code:

  • hvPlot provides the explorer
  • hvPlot also provides .interactive, which can now load data too, since support for functions as input was added
  • Lumen has recently gained a Pipeline class
  • Combining Panel+HoloViews (or just pure HoloViews) it's also possible to create data exploration pipelines

I would like to see whether we could document one way to create data exploration dashboards. Of course things can be composable, that's how HoloViz is designed, but I think it'd help users if we can guide them in a more opinionated way. A concrete realization of that would be a 30 min tutorial, starting from scratch and ending up with a deployed application. I don't know what API it would rely on (and it may require some additional work to get there); it might just be Lumen all the way (noting here that Lumen depends on Panel of course but also on hvPlot). And if this tutorial is successful, I guess it should be linked from every site (actually, the HoloViz tutorial could also be linked the same way) so that users can quickly assess the capabilities of HoloViz, and not just of the library they're using.

And ultimately I'd hope this work would help me find out whether the hvPlot explorer should gain transforming/filtering capabilities, or compose with Lumen, etc.

@jbednar
Member

jbednar commented Oct 14, 2022

These are all important points for discussion. Thanks for raising them, and sorry it's taken me so long to dig down to this issue in my inbox! I think we should make a distinction between "exploring data" and "building a data-exploration tool". Exploring data is an end in itself, directly producing insight, while building is about producing a tool that (later, we hope) will be used to gain insight.

hvPlot Explorer, as the name suggests, focuses on exploring, while Lumen Builder, as the name also suggests, focuses on building. I'm in favor of making Lumen Builder a great environment for data exploration as well: someone could fire it up, select some data, and then spend the rest of their session in the Builder doing data transformations, filtering, and plotting as needed to draw conclusions. They would save their work as an app, but really just as a bookmark to what they were doing, without necessarily expecting to give it to an external user. Despite that goal of mine, (a) Lumen Builder isn't set up well for that just yet, because various building-related tools and interfaces don't get out of the way, and (b) the Builder is a fairly heavyweight and constrained environment for exploring, compared to e.g. the Jupyter cell that the Explorer runs in, so not everyone is going to want to do their exploring within Lumen Builder.

Conversely, if using the Explorer, the goal is for a user to be able to select what they want to see at that moment, with immediate feedback that can lead directly to insight. I don't think the Explorer should directly support complex, multi-step definitions of filters and transformations in general, but I do think that features supporting exploration do belong there, including an option for sorting a particular dimension if otherwise a Curve or Area plot will come out completely jumbled. I think the perspective viewer provides a good guide to the scope of such an explorer: directly control what you are able to see, but don't try to do complex data-manipulation pipelines. Provide immediate insight and/or design and configure a plot; don't try to build full apps.
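
As a small illustration of the sorting point, here is a sketch assuming pandas; the data and column names are made up. A line-like plot joins points in row order, so rows that are not ordered along the x dimension produce a zigzag, and sorting on that dimension first avoids it.

```python
import pandas as pd

# Made-up data whose rows are not ordered by the x dimension.
df = pd.DataFrame({"day": [3, 1, 4, 2], "value": [30, 10, 40, 20]})

# A line plot of `df` would be drawn in row order: day 3 -> 1 -> 4 -> 2.
# Sorting on the x dimension gives the points a left-to-right order.
sorted_df = df.sort_values("day")

# With hvPlot installed, the sorted frame could then be plotted with:
#   import hvplot.pandas
#   sorted_df.hvplot.line(x="day", y="value")
```

This is the kind of single-step, view-oriented operation discussed above, not a multi-step transformation pipeline.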

Given those considerations, I think we should have documentation covering these very different use cases:

  • Want to explore a dataset? If you can figure out how to get ahold of the data somehow in a Jupyter notebook, then you can explore it immediately using the hvPlot Explorer, without any further coding needed. Here's how!
  • Have you built a Python data-processing and visualization pipeline that you'd like to make interactive, so that you or your users can see the effect of parameters that control it? That's really easy to do with hvPlot .interactive, letting you create custom, lightweight data-exploration apps "on the fly", whenever you need them. Here's how!
  • Are you more comfortable in a GUI application and would like to be able to design a data-transformation pipeline and data-exploration app? You can use Lumen Builder for that, and here's how!

And then separately there can be docs that explain how all these paths intersect and overlap, but none of that is needed for the individual topics here, each of which is self-contained and important on its own.
