Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Determine resolution for plots from data first, then from session #49

Merged
merged 5 commits into from Mar 12, 2021

Conversation

nhoening
Copy link
Contributor

@nhoening nhoening commented Mar 1, 2021

Our code to create data plots relied on the session holding the resolution. However, when we use it through the API (e.g. #39) there is no session, and we should actually look to the data we got passed.

We get a pandas DataFrame which often has no index.freq attribute ― because the index was a BeliefsDataFrame index which was then simplified.

So I added logic to infer the resolution from the data, which works pretty well.

@nhoening nhoening self-assigned this Mar 1, 2021
@nhoening nhoening added the UI label Mar 1, 2021
@nhoening nhoening requested a review from Flix6x March 5, 2021 20:51
Copy link
Contributor

@Flix6x Flix6x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an improvement. But I feel that what would be a better fix is to keep around the resolution from the point at which we query the db. We currently seem to lose the information when we simplify the index of the returned BeliefsDataFrame to a regular pandas DataFrame.

flexmeasures/ui/views/analytics.py Outdated Show resolved Hide resolved
@nhoening
Copy link
Contributor Author

It's an improvement. But I feel that what would be a better fix is to keep around the resolution from the point at which we query the db. We currently seem to lose the information when we simplify the index of the returned BeliefsDataFrame to a regular pandas DataFrame.

I didn't see a way to preserve freq during index simplification. Well, preserve is the wrong word, as it's not in the index in the first place. Maybe it's an improvement that could be done on the Pandas side, but I'm sure it's complex.

Finally, we might have other ways to get data here, so I guess this safety can't hurt.

@Flix6x
Copy link
Contributor

Flix6x commented Mar 11, 2021

I didn't see a way to preserve freq during index simplification. Well, preserve is the wrong word, as it's not in the index in the first place. Maybe it's an improvement that could be done on the Pandas side, but I'm sure it's complex.

Rather than in the index, the BeliefsDataFrame explicitly keeps track of the event resolution as a metadata property: df.event_resolution. I guess we should talk about and make clear the semantic difference between data frequency and event resolution. Data frequency describes a regular interval between data indices. Event resolution describes the time interval described by each data point. Thinking of a bar chart may help here: event_resolution is the bar width, while data frequency is the spacing between bars (incl. bar width). For example:

>>> import timely_beliefs as tb
>>> import pandas as pd
>>> from datetime import datetime, timedelta
>>> import pytz
>>> source = tb.BeliefSource("EPEX")
>>> sensor = tb.Sensor("EPEX SPOT day-ahead price", event_resolution=timedelta(hours=1), unit="EUR/MWh")
>>> s = pd.Series([63, 60, 61], index=pd.date_range(datetime(2000, 1, 3, 9), periods=3, tz=pytz.utc))
>>> bdf = tb.BeliefsDataFrame(s, belief_horizon=timedelta(hours=0), source=source, sensor=sensor)
>>> bdf
                                                                        event_value
event_start               belief_horizon source cumulative_probability             
2000-01-03 09:00:00+00:00 0 days         EPEX   0.5                              63
2000-01-04 09:00:00+00:00 0 days         EPEX   0.5                              60
2000-01-05 09:00:00+00:00 0 days         EPEX   0.5                              61
>>> bdf.event_resolution
datetime.timedelta(seconds=3600)
>>> bdf.index.get_level_values(level="event_start").freq)
None
>>> bdf.index.get_level_values(level="event_start").inferred_freq)
'D'

Here, data points are spaced 1 day apart, while each data point describes an hourly price.

Finally, we might have other ways to get data here, so I guess this safety can't hurt.

Agreed.

@nhoening
Copy link
Contributor Author

So is it wrong to assume that event_resolution can be used to assume the frequency as I did in the last commit? (I am thinking it's a better guess than nothing, but we could still let Pandas do the guessing to avoid mishaps).

@Flix6x
Copy link
Contributor

Flix6x commented Mar 11, 2021

That's a tough question, because I believe it is posed in the form of a misconception. You're looking to determine the resolution (bar width), not the frequency (spacing). So event_resolution is a great first guess, and index frequency is a reasonable fallback.

I'm preparing a commit now that makes simplify_index retain the event_resolution, so it is actually picked up by the line you added.

@nhoening nhoening merged commit 1b9080c into main Mar 12, 2021
@nhoening nhoening deleted the find-resolution-for-plots branch March 12, 2021 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants