Creation of Reporter class #641

Merged
merged 45 commits into from May 8, 2023
Changes from 40 commits
32e300c
Creating Reporter and PandasReporter classes with their corresponding…
victorgarcia98 Apr 14, 2023
ce17568
Added Tibber Reporter.
victorgarcia98 Apr 17, 2023
848e39a
- Fixing wrong DA Price value.
victorgarcia98 Apr 20, 2023
d9867fc
Updating VAT units.
victorgarcia98 Apr 20, 2023
1c9de43
- Attaching report to sensor
victorgarcia98 Apr 21, 2023
f61825b
Fixing wrong arguments to search_beliefs method.
victorgarcia98 Apr 21, 2023
b01f5c3
Fixing wrong type conversion logic.
victorgarcia98 Apr 21, 2023
487233c
Small reporter fixes (#647)
Flix6x Apr 24, 2023
f34f5e1
Add superclass to Reporter that will be common to all three data gene…
victorgarcia98 Apr 24, 2023
1f8457e
Merge remote-tracking branch 'origin/626-add-reporter-class' into 626…
victorgarcia98 Apr 24, 2023
dd21c99
Add start, end, resolution, beliefs_after and beliefs_before to the `…
victorgarcia98 Apr 24, 2023
58d405f
Add FLEXMEASURES_DEFAULT_DATASOURCE config to be the default datasourc…
victorgarcia98 Apr 24, 2023
bebc320
Fixing wrong input type.
victorgarcia98 Apr 27, 2023
0dfe5f4
Rename DataGenerator class to DataGeneratorMixin
victorgarcia98 Apr 27, 2023
7830e33
Reduce logging level from warning to debug.
victorgarcia98 Apr 27, 2023
e378ac6
Merge branch 'main' into 626-add-reporter-class
victorgarcia98 Apr 27, 2023
5166264
Merge branch 'main' into 626-add-reporter-class
victorgarcia98 Apr 27, 2023
6ab84df
Register Reporter to the app context.
victorgarcia98 Apr 27, 2023
79fe12f
Allowing to use BeliefsDataFrame specific method in the schema.
victorgarcia98 Apr 27, 2023
a64550b
Merge remote-tracking branch 'origin/626-add-reporter-class' into 626…
victorgarcia98 Apr 27, 2023
4c86878
Fixed wrong method. TODO: test with a plugin.
victorgarcia98 Apr 28, 2023
5ca52eb
Using module name instead of the module object.
victorgarcia98 Apr 28, 2023
12267d3
use belief_time instead of beliefs_before and beliefs_after (#652)
Flix6x May 1, 2023
723ed02
Merge remote-tracking branch 'origin/626-add-reporter-class' into 626…
victorgarcia98 May 1, 2023
5a6590e
Merge branch 'main' into 626-add-reporter-class
victorgarcia98 May 1, 2023
cc47742
Fixing example.
victorgarcia98 May 1, 2023
cc11809
Fixing grammar.
victorgarcia98 May 1, 2023
c50df17
Require at least 1 input sensor for the tb_query_config.
victorgarcia98 May 1, 2023
cc8aa22
Merge branch 'main' into 626-add-reporter-class
victorgarcia98 May 2, 2023
c36b904
Bug fix: compute function was overriding the variables to the default…
victorgarcia98 May 2, 2023
3ad4113
Changing end to get 24h and fix assert condition to detect NaN.
victorgarcia98 May 2, 2023
aba68ea
Adding belief time variable to schema.
victorgarcia98 May 2, 2023
cc545e1
Avoid deserializing multiple times.
victorgarcia98 May 2, 2023
9d5dc57
Add scope="module" to avoid recreating objects in DB.
victorgarcia98 May 2, 2023
801e358
fix: remove time parameters (start, end, ...) from the Reporter class …
victorgarcia98 May 5, 2023
99c47b4
style: type hints improvements
victorgarcia98 May 8, 2023
83a776f
style: rename tb_query_config for beliefs_search_config_schema
victorgarcia98 May 8, 2023
97f3b15
style: remove comment
victorgarcia98 May 8, 2023
aa8a266
Merge branch 'main' into 626-add-reporter-class
victorgarcia98 May 8, 2023
912aa4e
docs: add entry to changelog
victorgarcia98 May 8, 2023
9d8c737
style: clarifying attribute
victorgarcia98 May 8, 2023
1383b4a
style: fix docstring
victorgarcia98 May 8, 2023
08020f6
refactor: rename beliefs_search_config_schema beliefs_search_configs
victorgarcia98 May 8, 2023
9fbb699
Merge branch 'main' into 626-add-reporter-class
victorgarcia98 May 8, 2023
96c03b3
style: typo
victorgarcia98 May 8, 2023
3 changes: 3 additions & 0 deletions documentation/changelog.rst
@@ -9,6 +9,9 @@ v0.14.0 | June XX, 2023
New features
-------------

* Introduction of the classes `Reporter` and `PandasReporter` [see `PR #641 <https://www.github.com/FlexMeasures/flexmeasures/pull/641>`_]


Bugfixes
-----------

9 changes: 9 additions & 0 deletions documentation/configuration.rst
@@ -268,6 +268,15 @@ Set a negative value to persist forever.

Default: ``3600``

.. _datasource_config:

FLEXMEASURES_DEFAULT_DATASOURCE
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The name of the default DataSource for data generated by classes that use `DataGeneratorMixin` (for instance, reporters).

Default: ``"FlexMeasures"``


.. _planning_horizon_config:

6 changes: 6 additions & 0 deletions flexmeasures/app.py
@@ -100,6 +100,12 @@ def create(

    register_db_at(app)

    # Register Reporters
    from flexmeasures.utils.coding_utils import get_classes_module
    from flexmeasures.data.models.reporting import Reporter

    app.reporters = get_classes_module("flexmeasures.data.models.reporting", Reporter)

    # add auth policy

    from flexmeasures.auth import register_at as register_auth_at
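
The registration above relies on `get_classes_module`, whose implementation is not part of this diff. As a rough illustration of the idea (collecting the subclasses of a base class found in a module), here is a minimal sketch using only the standard library. The helper name `find_subclasses` is hypothetical, and the stdlib `json` module stands in for `flexmeasures.data.models.reporting`; the real helper may behave differently, for example by also walking submodules.

```python
import importlib
import inspect
import json


def find_subclasses(module_name: str, base: type) -> dict:
    """Hypothetical helper: map class names to classes in `module_name` that subclass `base`."""
    module = importlib.import_module(module_name)
    return {
        name: cls
        for name, cls in inspect.getmembers(module, inspect.isclass)
        if issubclass(cls, base) and cls is not base  # skip the base class itself
    }


# Stand-in for app.reporters: all ValueError subclasses exposed by the json module
registry = find_subclasses("json", ValueError)
```

With such a registry, a class could later be looked up by name, e.g. `registry["JSONDecodeError"]`.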
16 changes: 16 additions & 0 deletions flexmeasures/data/models/data_sources.py
@@ -4,11 +4,27 @@
import timely_beliefs as tb

from flexmeasures.data import db
from flask import current_app

if TYPE_CHECKING:
    from flexmeasures.data.models.user import User


class DataGeneratorMixin:
    @classmethod
    def get_data_source_info(cls: type) -> dict:
        """
        Create and return the data source info, from which a data source lookup/creation is possible.

        See for instance get_data_source_for_job().
        """
        source_info = dict(
            name=current_app.config.get("FLEXMEASURES_DEFAULT_DATASOURCE")
        )  # default

        return source_info


class DataSource(db.Model, tb.BeliefSourceDBMixin):
    """Each data source is a data-providing entity."""

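
To illustrate how `DataGeneratorMixin.get_data_source_info` ties into the `FLEXMEASURES_DEFAULT_DATASOURCE` setting, here is a minimal sketch that runs without a Flask app. The module-level `CONFIG` dict is a stand-in for `current_app.config`, and `MyReporter` with its extra `model` key is purely hypothetical; the diff does not define which keys subclasses may add.

```python
# Stand-in for current_app.config (assumption: the real value comes from the Flask config)
CONFIG = {"FLEXMEASURES_DEFAULT_DATASOURCE": "FlexMeasures"}


class DataGeneratorMixin:
    @classmethod
    def get_data_source_info(cls) -> dict:
        """Return the info from which a data source can be looked up or created."""
        return dict(name=CONFIG.get("FLEXMEASURES_DEFAULT_DATASOURCE"))


class MyReporter(DataGeneratorMixin):
    """Hypothetical data generator that extends the default source info."""

    @classmethod
    def get_data_source_info(cls) -> dict:
        info = super().get_data_source_info()
        info["model"] = cls.__name__  # hypothetical extra key, not taken from the diff
        return info


default_info = DataGeneratorMixin.get_data_source_info()
reporter_info = MyReporter.get_data_source_info()
```

Because the method is a classmethod, the source info can be obtained without instantiating the generator.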
157 changes: 157 additions & 0 deletions flexmeasures/data/models/reporting/__init__.py
@@ -0,0 +1,157 @@
from __future__ import annotations
from typing import Optional, Union, Dict

import pandas as pd

from flexmeasures.data.schemas.reporting import ReporterConfigSchema
from flexmeasures.data.models.time_series import Sensor
from flexmeasures.data.models.data_sources import DataGeneratorMixin


from datetime import datetime, timedelta

import timely_beliefs as tb


class Reporter(DataGeneratorMixin):
    """Superclass for all FlexMeasures Reporters."""

    __version__ = None
    __author__ = None
    __data_generator_base__ = "Reporter"

    sensor: Sensor = None

    reporter_config: Optional[dict] = None
    reporter_config_raw: Optional[dict] = None
    schema = ReporterConfigSchema()  # an instance, so that deserialize_config can call schema.load
    data: Dict[str, Union[tb.BeliefsDataFrame, pd.DataFrame]] = None

    def __init__(
        self, sensor: Sensor, reporter_config_raw: Optional[dict] = None
    ) -> None:
        """
        Initialize a new Reporter.

        Attributes:
        :param sensor: sensor where the output of the reporter will be saved to.
        :param reporter_config_raw: raw configuration of the reporter, before deserialization.
        """

        self.sensor = sensor

        if not reporter_config_raw:
            reporter_config_raw = {}

        self.reporter_config_raw = reporter_config_raw

    def fetch_data(
        self,
        start: datetime,
        end: datetime,
        input_resolution: timedelta | None = None,
        belief_time: datetime | None = None,
    ):
        """
        Fetch the timed beliefs from the database.
        """

        self.data = {}
        for tb_query in self.beliefs_search_config_schema:
            _tb_query = tb_query.copy()
            # using start / end instead of event_starts_after/event_ends_before when not defined
            event_starts_after = _tb_query.pop("event_starts_after", start)
            event_ends_before = _tb_query.pop("event_ends_before", end)
            resolution = _tb_query.pop("resolution", input_resolution)
            # use a separate name, so a belief_time set in one query does not leak into the next queries
            query_belief_time = _tb_query.pop("belief_time", belief_time)

            sensor: Sensor = _tb_query.pop("sensor", None)
            alias: str = _tb_query.pop("alias", None)

            bdf = sensor.search_beliefs(
                event_starts_after=event_starts_after,
                event_ends_before=event_ends_before,
                resolution=resolution,
                beliefs_before=query_belief_time,
                **_tb_query,
            )

            # store data source as local variable
            for source in bdf.sources.unique():
                self.data[f"source_{source.id}"] = source

            # store BeliefsDataFrame as local variable
            if alias:
                self.data[alias] = bdf
            else:
                self.data[f"sensor_{sensor.id}"] = bdf

    def update_attribute(self, attribute, default):
        if default is not None:
            setattr(self, attribute, default)

    def compute(
        self,
        start: datetime,
        end: datetime,
        input_resolution: timedelta | None = None,
        belief_time: datetime | None = None,
        **kwargs,
    ) -> tb.BeliefsDataFrame:
        """This method triggers the creation of a new report.

        The same object can generate multiple reports with different start, end, input_resolution
        and belief_time values.

        In the future, this function will parse arbitrary input arguments defined in a schema.
        """

        # deserialize configuration
        if self.reporter_config is None:
            self.deserialize_config()

        # fetch data
        self.fetch_data(start, end, input_resolution, belief_time)

        # Result
        result = self._compute(start, end, input_resolution, belief_time)

        # check that the event_resolution of the output BeliefsDataFrame equals that of the output sensor
        assert self.sensor.event_resolution == result.event_resolution

        # Assign sensor to BeliefsDataFrame
        result.sensor = self.sensor

        return result

    def _compute(
        self,
        start: datetime,
        end: datetime,
        input_resolution: timedelta | None = None,
        belief_time: datetime | None = None,
    ) -> tb.BeliefsDataFrame:
        """
        Override with the actual computation of your report.

        :returns BeliefsDataFrame: report as a BeliefsDataFrame.
        """
        raise NotImplementedError()

    def deserialize_config(self):
        """
        Validate the report config against a Marshmallow Schema.
        Ideas:
        - Override this method
        - Call superclass method to apply validation and common variables deserialization (see PandasReporter)
        - (Partially) extract the relevant reporter_config parameters into class attributes.

        Raises ValidationErrors or ValueErrors.
        """

        self.reporter_config = self.schema.load(
            self.reporter_config_raw
        )  # validate reporter config
        self.beliefs_search_config_schema = self.reporter_config.get(
            "beliefs_search_config_schema"
        )  # extract the timely-beliefs query configuration parameters
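
The `Reporter` class above follows a template-method layout: `compute` deserializes the config once, fetches the data, and delegates the actual work to `_compute`, which subclasses override. The following is a minimal, self-contained sketch of that flow, not the FlexMeasures implementation. `SketchReporter`, `SumReporter`, the `READINGS` store and the `queries` config key are all hypothetical stand-ins (plain dicts replace `Sensor.search_beliefs` and the Marshmallow schema).

```python
from datetime import datetime

# Hypothetical in-memory store replacing the database: sensor id -> {event start: value}
READINGS = {
    1: {datetime(2023, 5, 1, h): float(h) for h in range(4)},
    2: {datetime(2023, 5, 1, h): 10.0 for h in range(4)},
}


class SketchReporter:
    def __init__(self, config_raw=None):
        self.config_raw = config_raw or {}
        self.config = None
        self.data = {}

    def deserialize_config(self):
        # The real class validates with a Marshmallow schema; here we only copy
        self.config = dict(self.config_raw)

    def fetch_data(self, start, end):
        self.data = {}
        for query in self.config["queries"]:
            query = query.copy()
            # Per-query bounds win; otherwise fall back to the compute() arguments
            after = query.pop("event_starts_after", start)
            before = query.pop("event_ends_before", end)
            sensor_id = query.pop("sensor")
            key = query.pop("alias", None) or f"sensor_{sensor_id}"
            # Simplification: filter on event start only
            self.data[key] = {
                t: v for t, v in READINGS[sensor_id].items() if after <= t < before
            }

    def compute(self, start, end):
        if self.config is None:  # deserialize only once per reporter
            self.deserialize_config()
        self.fetch_data(start, end)
        return self._compute(start, end)

    def _compute(self, start, end):
        raise NotImplementedError()


class SumReporter(SketchReporter):
    def _compute(self, start, end):
        # Sum all fetched series per timestamp
        timestamps = sorted(set().union(*self.data.values()))
        return {t: sum(s.get(t, 0.0) for s in self.data.values()) for t in timestamps}


reporter = SumReporter({"queries": [{"sensor": 1}, {"sensor": 2, "alias": "baseline"}]})
report = reporter.compute(datetime(2023, 5, 1, 1), datetime(2023, 5, 1, 3))
```

The same `reporter` object can be reused with other `start` and `end` values, since `fetch_data` rebuilds `self.data` on every call.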
138 changes: 138 additions & 0 deletions flexmeasures/data/models/reporting/pandas_reporter.py
@@ -0,0 +1,138 @@
from __future__ import annotations

from typing import Any
from datetime import datetime, timedelta

from flask import current_app
import timely_beliefs as tb

from flexmeasures.data.models.reporting import Reporter
from flexmeasures.data.schemas.reporting.pandas_reporter import (
    PandasReporterConfigSchema,
)


class PandasReporter(Reporter):
    """This reporter applies a series of pandas methods to the dataframes stored in `self.data`."""

    __version__ = "1"
    __author__ = None
    schema = PandasReporterConfigSchema()
    transformations: list[dict[str, Any]] = None
    final_df_output: str = None

    def deserialize_config(self):
        # call super class deserialize_config
        super().deserialize_config()

        # extract PandasReporter specific fields
        self.transformations = self.reporter_config.get("transformations")
        self.final_df_output = self.reporter_config.get("final_df_output")

    def _compute(
        self,
        start: datetime,
        end: datetime,
        input_resolution: timedelta | None = None,
        belief_time: datetime | None = None,
    ) -> tb.BeliefsDataFrame:
        """
        This method applies the transformations and outputs the dataframe
        defined in `final_df_output` field of the report_config.
        """

        # apply pandas transformations to the dataframes in `self.data`
        self._apply_transformations()

        final_output = self.data[self.final_df_output]

        return final_output

    def get_object_or_literal(self, value: Any, method: str) -> Any:
        """This method allows using the dataframes as inputs of the Pandas methods that
        are run in the transformations. Make sure that they have been created before they are accessed.

        This works by putting the symbol `@` in front of the name of the dataframe that we want to reference.
        For instance, to reference the dataframe test_df, which lives in self.data, we would do `@test_df`.

        This functionality is disabled for the methods `eval` and `query` to avoid interfering with their internal behaviour,
        given that they also use `@` to allow using local variables.

        Example:
        >>> self.get_object_or_literal(["@df_wind", "@df_solar"], "sum")
        [<BeliefsDataFrame for Wind Turbine sensor>, <BeliefsDataFrame for Solar Panel sensor>]
        """

        if method in ["eval", "query"]:
            if isinstance(value, str) and value.startswith("@"):
                current_app.logger.debug(
                    "Cannot reference objects in self.data using the method eval or query. That is because these methods use the symbol `@` to make reference to local variables."
                )
            return value

        if isinstance(value, str) and value.startswith("@"):
            value = value[1:]  # strip only the leading `@`
            return self.data[value]

        if isinstance(value, list):
            return [self.get_object_or_literal(v, method) for v in value]

        return value

    def _process_pandas_args(self, args: list, method: str) -> list:
        """This method applies the function get_object_or_literal to all the arguments
        to detect where to replace a string "@<object-name>" with the actual object stored in `self.data["<object-name>"]`.
        """
        # build a new list, so the configured transformation is not mutated between compute() calls
        return [self.get_object_or_literal(arg, method) for arg in args]

    def _process_pandas_kwargs(self, kwargs: dict, method: str) -> dict:
        """This method applies the function get_object_or_literal to all the keyword arguments
        to detect where to replace a string "@<object-name>" with the actual object stored in `self.data["<object-name>"]`.
        """
        # build a new dict, so the configured transformation is not mutated between compute() calls
        return {k: self.get_object_or_literal(v, method) for k, v in kwargs.items()}

    def _apply_transformations(self):
        """Convert the series using the given list of transformation specs, which are applied in the order given.

        Each transformation spec should include a 'method' key specifying a method name of a Pandas DataFrame.

        Optionally, 'args' and 'kwargs' keys can be specified to pass on arguments or keyword arguments to the given method.

        All data exchange is made through the dictionary `self.data`. The superclass Reporter already fetches BeliefsDataFrames of
        the sensors and saves them in the self.data dictionary fields `sensor_<sensor_id>`. In case you need to perform complex
        operations on dataframes, you can split the operations into several steps and save the intermediate results using the
        parameters `df_input` and `df_output` for the input and output dataframes, respectively.

        Example:

        The example below converts from hourly meter readings in kWh to electricity demand in kW.
            transformations = [
                {"method": "diff"},
                {"method": "shift", "kwargs": {"periods": -1}},
                {"method": "head", "args": [-1]},
            ]
        """

        previous_df = None

        for transformation in self.transformations:
            df_input = transformation.get(
                "df_input", previous_df
            )  # default is using the previous transformation output
            df_output = transformation.get(
                "df_output", df_input
            )  # default is OUTPUT = INPUT.method()

            method = transformation.get("method")
            args = self._process_pandas_args(transformation.get("args", []), method)
            kwargs = self._process_pandas_kwargs(
                transformation.get("kwargs", {}), method
            )

            self.data[df_output] = getattr(self.data[df_input], method)(*args, **kwargs)

            previous_df = df_output
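
To make the transformation chain concrete, here is a minimal sketch that reproduces the docstring example (hourly meter readings in kWh to demand in kW) with plain pandas DataFrames, plus one step that uses an `@` reference. The functions `resolve` and `apply_transformations` are simplified, hypothetical stand-ins for `get_object_or_literal` and `_apply_transformations`; the dataframe names `meter`, `offset`, `demand` and `demand_plus_offset` are made up for illustration, and the first step names its `df_input` explicitly because no previous output exists yet.

```python
import pandas as pd


def resolve(value, data, method):
    """Replace "@name" strings by data["name"] (disabled for eval/query)."""
    if method in ("eval", "query"):
        return value
    if isinstance(value, str) and value.startswith("@"):
        return data[value[1:]]
    if isinstance(value, list):
        return [resolve(v, data, method) for v in value]
    return value


def apply_transformations(data, transformations):
    previous = None
    for spec in transformations:
        df_input = spec.get("df_input", previous)  # default: previous output
        df_output = spec.get("df_output", df_input)  # default: overwrite the input
        method = spec["method"]
        args = [resolve(a, data, method) for a in spec.get("args", [])]
        kwargs = {k: resolve(v, data, method) for k, v in spec.get("kwargs", {}).items()}
        data[df_output] = getattr(data[df_input], method)(*args, **kwargs)
        previous = df_output
    return data


index = pd.date_range("2023-05-01", periods=4, freq=pd.Timedelta(hours=1))
data = {
    "meter": pd.DataFrame({"kWh": [100.0, 102.0, 105.0, 109.0]}, index=index),
    "offset": pd.DataFrame({"kWh": [1.0, 1.0, 1.0]}, index=index[:3]),
}
transformations = [
    {"df_input": "meter", "df_output": "demand", "method": "diff"},
    {"method": "shift", "kwargs": {"periods": -1}},  # align each difference with its hour
    {"method": "head", "args": [-1]},  # drop the trailing NaN row
    {"method": "add", "args": ["@offset"], "df_output": "demand_plus_offset"},
]
apply_transformations(data, transformations)
```

After the first three steps, `data["demand"]` holds the energy consumed in each hour, which for hourly data equals the average power in kW; the last step writes to a new key, so `demand` itself stays untouched.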
Empty file.