Plot does not show outliers at most zoomed-out level #234
Comments
The purpose of plotly-resampler, as its name implies, is to resample (or aggregate) data in order to improve the scalability of time-series visualization. This aggregation selects a fixed number of data points within a given range. You can think of this as selecting a single data point for each sub-interval of that range. When zoomed out, each sub-interval is larger, so some interesting points may be omitted. However, these data-aggregation algorithms are designed to capture the general trend and extreme values. When you zoom in, the sub-intervals shrink, which gives a more detailed representation of the data. I'm rather curious why you wouldn't want the resample-on-zoom functionality. So what can you do:
Hope this answers your question.
I was under the impression that plotly-resampler would "somehow" always show at least 1 point for a cluster of points which are "close enough". In that sense, I thought that there is always a representative and that more points can be visualized by zooming in.
I understand that there will be information loss in the zoomed-out view. Just to confirm: information loss is possible such that there is no representative for a cluster of points, right?
I am fine with resample on zoom functionality. I just wanted to ensure that if I zoom "sufficiently enough", I should be able to view complete information. I will try with |
Hi @vinay-hebb, I hope you are doing well, and sorry for this late reply, I had some holidays! 🌴
A: You could certainly implement such an algorithm, but such algorithms often require more than 1 pass over the data (which can be time-consuming). As of now, all supported aggregation algorithms use only a single (and sometimes even a parallelizable) pass over the raw data to select a data point for each bin. These bins can be defined as: It is this use of linear, bin-wise data aggregators that ensures plotly-resampler is able to scale to even billions of data points per trace! 🐎
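To make the single-pass, bin-wise idea concrete, here is a minimal pure-Python sketch. It is not plotly-resampler's implementation: the helper name `minmax_per_bin` and the equal-count binning are assumptions made for illustration. It keeps the minimum and maximum sample of every bin, which is why an isolated spike survives even when the bins are wide.

```python
from typing import List, Sequence


def minmax_per_bin(y: Sequence[float], n_bins: int) -> List[int]:
    """Return the indices of the min and max sample of each equal-count bin.

    Hypothetical helper for illustration only; every sample is visited once.
    """
    n = len(y)
    if n == 0 or n_bins <= 0:
        return []
    n_bins = min(n_bins, n)  # guarantees that every bin holds at least one sample
    selected: List[int] = []
    for b in range(n_bins):
        start = b * n // n_bins
        stop = (b + 1) * n // n_bins
        i_min = i_max = start
        for i in range(start + 1, stop):
            if y[i] < y[i_min]:
                i_min = i
            if y[i] > y[i_max]:
                i_max = i
        # A constant bin contributes one index, otherwise two (in x order).
        selected.extend(sorted({i_min, i_max}))
    return selected


# A flat signal with one spike: the spike is the maximum of its bin, so it is kept.
signal = [0.0] * 1000
signal[537] = 25.0
kept = minmax_per_bin(signal, n_bins=10)
print(537 in kept)
```

Because each bin is reduced independently, the loop over bins could also be split across workers, which is the sense in which such aggregators are parallelizable.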
A: Indeed, when you zoom in, the aggregation algorithm will re-run and select the same number of data points! :) (i.e.
A: As we use bins, from each of which only a fixed number of data points is selected, and a bin may contain more than 1 cluster, you may indeed lose some points that are representative of other clusters.
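The effect can be reproduced with a toy aggregator. The sketch below is an assumption-laden illustration, not one of the library's downsamplers: the hypothetical `one_point_per_bin` keeps only the sample with the largest absolute value in each bin of the visible range. Two outliers that share a bin in the zoomed-out view compete for one slot; after zooming in, the same number of bins covers a narrower range and both become visible.

```python
from typing import List, Sequence


def one_point_per_bin(y: Sequence[float], lo: int, hi: int, n_bins: int) -> List[int]:
    """Pick the index with the largest |value| in each bin of the view [lo, hi).

    Hypothetical helper for illustration only.
    """
    span = hi - lo
    n_bins = min(n_bins, span)  # never create empty bins
    picked: List[int] = []
    for b in range(n_bins):
        start = lo + b * span // n_bins
        stop = lo + (b + 1) * span // n_bins
        picked.append(max(range(start, stop), key=lambda i: abs(y[i])))
    return picked


signal = [0.0] * 1000
signal[510] = 5.0  # smaller outlier
signal[560] = 9.0  # larger outlier that lands in the same zoomed-out bin

# Zoomed out: 10 bins of 100 samples each, so both outliers fall in bin [500, 600).
zoomed_out = one_point_per_bin(signal, 0, 1000, n_bins=10)
# Zoomed in on [500, 600): still 10 bins, but each now spans only 10 samples.
zoomed_in = one_point_per_bin(signal, 500, 600, n_bins=10)

print(510 in zoomed_out, 560 in zoomed_out)
print(510 in zoomed_in, 560 in zoomed_in)
```

Both views return the same number of points; only the bin width changes, which is what makes the smaller outlier reappear on zoom.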
This should always be true! Also note how the orange
Good question, there is no straightforward answer; I would hint to use an I can maybe point you to #247, in which I elaborate more on the ideal number of samples and the default downsampler (which is MinMaxLTTB). Hope this helps you further.
Maybe a noob question
Setup: I am using plotly-resampler with dynamic aggregation
Requirement: I want to see outliers at the zoomed-out level, and I want to zoom in to visualize the data (without resampling) and understand its nuances
Questions:
Problem:
Zoomed-out screenshot
Zoomed-in screenshot
The green-circled points appear only after zooming in. Is it possible to see them at the zoomed-out level as well? Since they are missing at the zoomed-out level, I might overlook them in my data analysis.