This issue was moved to a discussion.

You can continue the conversation there.
Ability to store flex-model & flex-context on sensor & asset #809

Closed
nhoening opened this issue Aug 11, 2023 · 11 comments

@nhoening
Contributor

These two dicts should be looked up on the sensor and/or asset by the scheduler (or other code that needs them). Then, sending them with each API call would not be necessary.

This connects to #774 and #335, as well.

How would adding and editing work? Right now we can set dicts on attributes in the CLI and API, I believe, and editing in the UI works. But maybe that support is not quite enough. For instance, I'd like schema checking on these attributes, given how important these two are.

@victorgarcia98
Contributor

Regarding the first point, I would adapt the Schedulers to work in the DataGenerators framework:

```python
from typing import Any, Dict, List

from flask import current_app
from marshmallow import Schema

from flexmeasures.data.models.data_sources import DataSource


class DataGenerator:
    __data_generator_base__: str | None = None
    _data_source: DataSource | None = None

    _config: dict | None = None
    _parameters: dict | None = None

    _parameters_schema: Schema | None = None
    _config_schema: Schema | None = None

    _save_config: bool = True
    _save_parameters: bool = False

    def __init__(
        self,
        config: dict | None = None,
        save_config=True,
        save_parameters=False,
        **kwargs,
    ) -> None:
        """Base class for Schedulers, Reporters and Forecasters.

        The configuration `config` stores static parameters: parameters that, if
        changed, trigger the creation of a new DataSource. Dynamic parameters, such as
        the start date, can go into the `parameters`. See the docstring of the method
        `DataGenerator.compute` for more details. Nevertheless, the parameter
        `save_parameters` can be set to True if some `parameters` need to be saved
        to the DB. In that case, the method `_clean_parameters` is called to remove
        any field that is not to be persisted, e.g. time parameters which are
        already contained in the TimedBelief.

        Create a new DataGenerator with a certain configuration. There are two
        alternative ways to define the parameters:

            1. Serialized, through the keyword argument `config`.
            2. Deserialized, passing each parameter as a keyword argument.

        The configuration is validated using the schema `_config_schema`, to be
        defined by the subclass.

        `config` cannot contain the key `config` at its top level, otherwise it
        could conflict with the constructor keyword argument `config` when passing
        the config as deserialized attributes.

        Example:

            The configuration requires two parameters for the PV and consumption sensors.

            Option 1:
                dg = DataGenerator(config={
                    "sensor_pv": 1,
                    "sensor_consumption": 2,
                })

            Option 2:
                sensor_pv = Sensor.query.get(1)
                sensor_consumption = Sensor.query.get(2)
                dg = DataGenerator(sensor_pv=sensor_pv,
                                   sensor_consumption=sensor_consumption)

        :param config: serialized `config` parameters, defaults to None
        :param save_config: whether to save the config into the data source attributes
        :param save_parameters: whether to save the parameters into the data source attributes
        """
        self._save_config = save_config
        self._save_parameters = save_parameters

        if config is None and len(kwargs) > 0:
            self._config = kwargs
            DataGenerator.validate_deserialized(self._config, self._config_schema)
        elif config is not None:
            self._config = self._config_schema.load(config)
        elif len(kwargs) == 0:
            self._config = self._config_schema.load({})

    def _compute(self, **kwargs) -> List[Dict[str, Any]]:
        raise NotImplementedError()

    def compute(self, parameters: dict | None = None, **kwargs) -> List[Dict[str, Any]]:
        """The configuration `parameters` stores dynamic parameters: parameters that, if
        changed, DO NOT trigger the creation of a new DataSource. Static parameters,
        such as the topology of an energy system, can go into `config`.

        `parameters` cannot contain the key `parameters` at its top level, otherwise
        it could conflict with the keyword argument `parameters` of the method
        `compute` when passing the `parameters` as deserialized attributes.

        :param parameters: serialized `parameters`, defaults to None
        """
        if self._parameters is None:
            self._parameters = {}

        if parameters is None:
            self._parameters.update(self._parameters_schema.dump(kwargs))
        else:
            self._parameters.update(parameters)

        self._parameters = self._parameters_schema.load(self._parameters)

        return self._compute(**self._parameters)

    @staticmethod
    def validate_deserialized(values: dict, schema: Schema) -> bool:
        # `load` raises a ValidationError if the values don't fit the schema
        schema.load(schema.dump(values))
        return True

    @classmethod
    def get_data_source_info(cls: type) -> dict:
        """
        Create and return the data source info, from which a data source
        lookup/creation is possible. See for instance get_data_source_for_job().
        """
        source_info = dict(
            source=current_app.config.get("FLEXMEASURES_DEFAULT_DATASOURCE")
        )  # default
        source_info["source_type"] = cls.__data_generator_base__
        source_info["model"] = cls.__name__
        return source_info

    @property
    def data_source(self) -> "DataSource":
        """DataSource property derived from the `source_info`: `source_type`
        (scheduler, forecaster or reporter), `model` (e.g. AggregatorReporter)
        and `attributes`. It looks for a data source in the database that matches
        the `source_info` and, in case it does not find any, it creates a new one.
        This property is created once and cached for the rest of the lifetime
        of the DataGenerator object.
        """
        from flexmeasures.data.services.data_sources import get_or_create_source

        if self._data_source is None:
            data_source_info = self.get_data_source_info()

            attributes = {"data_generator": {}}
            if self._save_config:
                attributes["data_generator"]["config"] = self._config_schema.dump(
                    self._config
                )
            if self._save_parameters:
                attributes["data_generator"]["parameters"] = self._clean_parameters(
                    self._parameters_schema.dump(self._parameters)
                )

            data_source_info["attributes"] = attributes

            self._data_source = get_or_create_source(**data_source_info)

        return self._data_source

    def _clean_parameters(self, parameters: dict) -> dict:
        """Use this method to clean up the `parameters` dictionary, removing fields
        that are not to be persisted to the DB as data source attributes (when
        save_parameters=True), e.g. because they are already stored as TimedBelief
        properties, or otherwise.

        Example:

            A DataGenerator has the parameters ["start", "end", "field1", "field2"]
            and we want just "field1" and "field2" to be persisted.

            Parameters provided to the `compute` method (input of `_clean_parameters`):
                parameters = {
                    "start": "2023-01-01T00:00:00+02:00",
                    "end": "2023-01-02T00:00:00+02:00",
                    "field1": 1,
                    "field2": 2,
                }

            Parameters persisted to the DB (output of `_clean_parameters`):
                parameters = {"field1": 1, "field2": 2}
        """
        raise NotImplementedError()
```

In this way, we can have multiple schedulers tied to the same sensors. In practice, this would work like this:

```python
scheduler = StorageScheduler(config=config)
db.session.add(scheduler.data_source)  # save scheduler config to DB, this has id 2

...

scheduler = DataSource.query.get(2).data_generator
schedule = scheduler.compute(parameters=parameters)
```

What do you think?

@Flix6x
Contributor

Flix6x commented Aug 11, 2023

> What do you think?

I agree, that information should go towards the DataSource.attributes. But perhaps the sensor or asset should still store some information about which data source / generator should be used as a default?

@nhoening
Contributor Author

I don't agree that data sources are the correct and expected way to store this information about reality.

The data source is a place you use to express how the source interprets reality. In simulations, I can see that case. But it is a bad modeling decision in terms of mental concepts, which matter a lot.

My proposal is that sensors and assets (and maybe sites) are the default places, and API calls and data sources can overwrite them (also partially).

@Flix6x
Contributor

Flix6x commented Aug 11, 2023

> The data source is a place you use to express how the source interprets reality.

I especially don't understand this part.

@nhoening
Contributor Author

I understood @victorgarcia98 this way. You want to store this information on the data source so you can simulate different scenarios. I phrased this as interpretations of reality.

I also recall you plan to create new data sources when this information changes, so as to keep a perfect record of the context in which data was created.

Those are practical considerations. I understand them, even if I don't really see the second one being very clean or easy to follow.

My consideration is how natural this modeling feels. Any developer would expect this info on the assets or sensors.

@Flix6x
Contributor

Flix6x commented Aug 11, 2023

Thanks for clarifying. I agree it makes sense that the (current) defaults are specified on the assets/sensors. Specifically: the device's flex model as a sensor attribute, and the device's flex context as an asset attribute (because this context is potentially shared with other flexible devices).
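That split could be sketched roughly like this (the stand-in classes and attribute keys are hypothetical, just to show where each default would live):

```python
class Asset:
    """Hypothetical stand-in for a FlexMeasures asset."""
    def __init__(self, attributes: dict):
        self.attributes = attributes


class Sensor:
    """Hypothetical stand-in for a FlexMeasures sensor, belonging to an asset."""
    def __init__(self, generic_asset: Asset, attributes: dict):
        self.generic_asset = generic_asset
        self.attributes = attributes


def default_scheduling_specs(power_sensor: Sensor) -> dict:
    """Device flex-model from the sensor; shared flex-context from its asset."""
    return {
        "flex-model": power_sensor.attributes.get("flex-model", {}),
        "flex-context": power_sensor.generic_asset.attributes.get("flex-context", {}),
    }


site = Asset({"flex-context": {"consumption-price-sensor": 5}})
battery_power = Sensor(site, {"flex-model": {"soc-max": "90 kWh"}})
default_scheduling_specs(battery_power)
# → {'flex-model': {'soc-max': '90 kWh'}, 'flex-context': {'consumption-price-sensor': 5}}
```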

@victorgarcia98
Contributor

> I agree, that information should go towards the DataSource.attributes. But perhaps the sensor or asset should still store some information about which data source / generator should be used as a default?

Good idea! That way we can keep triggering schedules based on the sensor id.

@victorgarcia98
Contributor

victorgarcia98 commented Aug 12, 2023

> I don't agree that data sources are the correct and expected way to store this information about reality.

I'm not against keeping most of the information in the Asset or Sensor attributes, especially when we are talking about physical parameters: capacity_in_mw, soc_max (battery capacity), etc.

Nonetheless, I see some benefits of keeping the description of the scheduler as DataSource attributes:

  • In a MIMO setting, where we have to schedule multiple devices, to which of the sensors would you attribute the schedule? Perhaps this could be on a GridConnection asset.
  • In relation to the meaning of DataSource, I understand a DataSource as the origin of data: where or what generated it. In the case of a schedule, the origin of the data is the scheduler class itself plus the parameters involved. For example, the scheduler config could contain the topology, which could change (e.g. unavailable PV). An alternative would be to tie the schedule to a "Grid Connection" asset. Another example is the prefer_charging_sooner parameter. Other data, such as the initial SOC, should definitely be tied to the asset/sensor attributes.
  • We can keep track of the change in the parameters as the resulting schedules would have a different DataSource.

More generally, we need to decide where each of the fields in flex-context and flex-model will live:

  • config (DataSource)
  • parameters (DataSource)
  • Asset
  • Sensor
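Whatever split is chosen, a single precedence order could resolve each field, e.g. with a `ChainMap`. The order shown below (parameters over config over sensor over asset) is only one possibility; deciding the actual order is exactly the open question here:

```python
from collections import ChainMap


def resolve_field(name: str, parameters: dict, config: dict,
                  sensor_attrs: dict, asset_attrs: dict):
    """First match wins: parameters > config > sensor > asset."""
    return ChainMap(parameters, config, sensor_attrs, asset_attrs).get(name)


resolve_field("soc-max", {}, {}, {"soc-max": "90 kWh"}, {"soc-max": "100 kWh"})
# → '90 kWh' (the sensor attribute shadows the asset attribute)
```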

@nhoening
Contributor Author

Instead of a GridConnection, I propose to use Site in #810.

I agree that these attributes are not all the same type of reality. Some are about physics (e.g. max_soc), some about preferences (e.g. prefer_charging_sooner). [I haven't looked at them all for now]

Either the default info is on the physical model (sensor/asset/site), and schedulers (data sources) can save custom values (overwriting standard values on the physical model) for all the preferential settings.
Or schedulers can update everything.

Those are the two directions I see here, and both have merits.

Something I came across for motivation (from Kill It With Fire: Manage Aging Computer Systems):

"The lesson to learn here is the systems that feel familiar to people always provide more value than the systems that have structural elegances but run contrary to expectations."

And people will feel that the physical things are the main or only places to think about when modeling.

@nhoening
Contributor Author

Btw, this issue should become a discussion first, right?

@victorgarcia98
Contributor

> Instead of a GridConnection, I propose to use Site in #810.

In either case (keeping it at the Asset or the Site level), I would work towards supporting hierarchical descriptions. This might be useful for campus-like installations, where there can be many internal power constraints (building level, zone level, campus level).

Somehow, I lean more towards making the Schedulers their own entity, similarly to what we did with the Reporters. On the one hand, we can control which parameter changes require the creation of a new DataSource, which allows tracking changes. On the other, we can reuse the same Scheduler for multiple sites.

For example, for a hypothetical future scheduler that optimizes participation in different ancillary services (FCR, aFRR, ...), we could define different schedulers to participate in different markets (e.g. just FCR, or both FCR and aFRR). Maybe for this we need an entity to represent a pool of assets (which could, in turn, be an asset).

> Either the default info is on the physical model (sensor/asset/site), and schedulers (data sources) can save custom values (overwriting standard values on the physical model) for all the preferential settings.

Regarding fallback mechanisms, I think they are convenient, as they avoid having to pass all the parameters, but they can also induce a lot of confusion and a lack of traceability.

"The lesson to learn here is the systems that feel familiar to people always provide more value than the systems that have structural elegances but run contrary to expectations."

Good quote!

> Btw, this issue should become a discussion first, right?

I agree, this became a more transversal discussion.

@FlexMeasures FlexMeasures locked and limited conversation to collaborators Aug 21, 2023
@victorgarcia98 victorgarcia98 converted this issue into discussion #827 Aug 21, 2023
