This issue was moved to a discussion.

You can continue the conversation there.
Ability to store flex-model & flex-context on sensor & asset #809

Closed
nhoening opened this issue Aug 11, 2023 · 11 comments

@nhoening
Contributor

These two dicts should be looked up on the sensor and/or asset by the scheduler (or other code that needs them). Then, sending them with each API call would not be necessary.

This connects to #774 and #335, as well.

How would adding and editing work? Right now we can set dicts on attributes in the CLI and API, I believe, and editing in the UI works. But maybe that support is not quite enough. For instance, I'd like schema checking on these attributes, given how important these two are.

@victorgarcia98
Contributor

Regarding the first point, I would adapt the Schedulers to work in the DataGenerators framework:

```python
from typing import Any, Dict, List

from flask import current_app
from marshmallow import Schema

from flexmeasures.data.models.data_sources import DataSource


class DataGenerator:
    __data_generator_base__: str | None = None
    _data_source: DataSource | None = None

    _config: dict | None = None
    _parameters: dict | None = None

    _parameters_schema: Schema | None = None
    _config_schema: Schema | None = None

    _save_config: bool = True
    _save_parameters: bool = False

    def __init__(
        self,
        config: dict | None = None,
        save_config=True,
        save_parameters=False,
        **kwargs,
    ) -> None:
        """Base class for Schedulers, Reporters and Forecasters.

        The configuration `config` stores static parameters: parameters that, if
        changed, trigger the creation of a new DataSource. Dynamic parameters, such as
        the start date, can go into the `parameters`. See the docstring of the method
        `DataGenerator.compute` for more details. Nevertheless, the parameter
        `save_parameters` can be set to True if some `parameters` need to be saved
        to the DB. In that case, the method `_clean_parameters` is called to remove
        any field that is not to be persisted, e.g. time parameters which are
        already contained in the TimedBelief.

        Create a new DataGenerator with a certain configuration. There are two
        alternative ways to define the parameters:

            1. Serialized, through the keyword argument `config`.
            2. Deserialized, passing each parameter as a keyword argument.

        The configuration is validated using the schema `_config_schema`, to be
        defined by the subclass.

        `config` cannot contain the key `config` at its top level, otherwise it
        could conflict with the constructor keyword argument `config` when passing
        the config as deserialized attributes.

        Example:

            The configuration requires two parameters for the PV and consumption sensors.

            Option 1:
                dg = DataGenerator(config={
                    "sensor_pv": 1,
                    "sensor_consumption": 2,
                })

            Option 2:
                sensor_pv = Sensor.query.get(1)
                sensor_consumption = Sensor.query.get(2)
                dg = DataGenerator(sensor_pv=sensor_pv,
                                   sensor_consumption=sensor_consumption)

        :param config: serialized `config` parameters, defaults to None
        :param save_config: whether to save the config into the data source attributes
        :param save_parameters: whether to save the parameters into the data source attributes
        """
        self._save_config = save_config
        self._save_parameters = save_parameters

        if config is None and len(kwargs) > 0:
            self._config = kwargs
            DataGenerator.validate_deserialized(self._config, self._config_schema)
        elif config is not None:
            self._config = self._config_schema.load(config)
        elif len(kwargs) == 0:
            self._config = self._config_schema.load({})

    def _compute(self, **kwargs) -> List[Dict[str, Any]]:
        raise NotImplementedError()

    def compute(self, parameters: dict | None = None, **kwargs) -> List[Dict[str, Any]]:
        """The configuration `parameters` stores dynamic parameters: parameters that, if
        changed, DO NOT trigger the creation of a new DataSource. Static parameters,
        such as the topology of an energy system, can go into `config`.

        `parameters` cannot contain the key `parameters` at its top level, otherwise
        it could conflict with the keyword argument `parameters` of the method
        `compute` when passing the `parameters` as deserialized attributes.

        :param parameters: serialized `parameters`, defaults to None
        """
        if self._parameters is None:
            self._parameters = {}

        if parameters is None:
            self._parameters.update(self._parameters_schema.dump(kwargs))
        else:
            self._parameters.update(parameters)

        self._parameters = self._parameters_schema.load(self._parameters)

        return self._compute(**self._parameters)

    @staticmethod
    def validate_deserialized(values: dict, schema: Schema) -> bool:
        # `load` raises a ValidationError if the values don't fit the schema
        schema.load(schema.dump(values))
        return True

    @classmethod
    def get_data_source_info(cls: type) -> dict:
        """
        Create and return the data source info, from which a data source
        lookup/creation is possible. See for instance get_data_source_for_job().
        """
        source_info = dict(
            source=current_app.config.get("FLEXMEASURES_DEFAULT_DATASOURCE")
        )  # default
        source_info["source_type"] = cls.__data_generator_base__
        source_info["model"] = cls.__name__
        return source_info

    @property
    def data_source(self) -> "DataSource":
        """DataSource property derived from the `source_info`: `source_type`
        (scheduler, forecaster or reporter), `model` (e.g. AggregatorReporter)
        and `attributes`. It looks for a data source in the database that matches
        the `source_info` and, in case it does not find any, it creates a new one.
        This property is created once and cached for the rest of the lifetime
        of the DataGenerator object.
        """
        from flexmeasures.data.services.data_sources import get_or_create_source

        if self._data_source is None:
            data_source_info = self.get_data_source_info()

            attributes = {"data_generator": {}}
            if self._save_config:
                attributes["data_generator"]["config"] = self._config_schema.dump(
                    self._config
                )
            if self._save_parameters:
                attributes["data_generator"]["parameters"] = self._clean_parameters(
                    self._parameters_schema.dump(self._parameters)
                )

            data_source_info["attributes"] = attributes

            self._data_source = get_or_create_source(**data_source_info)

        return self._data_source

    def _clean_parameters(self, parameters: dict) -> dict:
        """Use this method to clean up the `parameters` dictionary, removing fields
        that are not to be persisted to the DB as data source attributes (when
        save_parameters=True), e.g. because they are already stored as TimedBelief
        properties, or otherwise.

        Example:

            A DataGenerator has the parameters ["start", "end", "field1", "field2"]
            and we want just "field1" and "field2" to be persisted.

            Parameters provided to the `compute` method (input of `_clean_parameters`):
                parameters = {
                    "start": "2023-01-01T00:00:00+02:00",
                    "end": "2023-01-02T00:00:00+02:00",
                    "field1": 1,
                    "field2": 2,
                }

            Parameters persisted to the DB (output of `_clean_parameters`):
                parameters = {"field1": 1, "field2": 2}
        """
        raise NotImplementedError()
```

In this way, we can have multiple schedulers tied to the same sensors. In practice, this would work like this:

```python
scheduler = StorageScheduler(config=config)
db.session.add(scheduler.data_source)  # save scheduler config to DB, this has id 2

...

scheduler = DataSource.query.get(2).data_generator
schedule = scheduler.compute(parameters=parameters)
```

What do you think?

@Flix6x
Contributor

Flix6x commented Aug 11, 2023

> What do you think?

I agree, that information should go towards the DataSource.attributes. But perhaps the sensor or asset should still store some information about which data source / generator should be used as a default?

@nhoening
Contributor Author

I don't agree that data sources are the correct and expected way to store this information about reality.

The data source is a place you use to express how the source interprets reality. In simulations, I can see that case. But it is a bad modeling decision in terms of mental concepts, which matter a lot.

My proposal is that sensors and assets (and maybe sites) are the default places, and API calls and data sources can overwrite them (also partially).

@Flix6x
Contributor

Flix6x commented Aug 11, 2023

> The data source is a place you use to express how the source interprets reality.

I especially don't understand this part.

@nhoening
Contributor Author

I understood @victorgarcia98 this way. You want to store this information on the data source so you can simulate different scenarios. I phrased this as interpretations of reality.

I also recall you plan to create new data sources when this information changes, so as to keep a perfect record of the context in which data was created.

Those are practical considerations. I understand them, even if I don't really see the second one being very clean or easy to follow.

My consideration is how natural this modeling feels. Any developer would expect this info on the assets or sensors.

@Flix6x
Contributor

Flix6x commented Aug 11, 2023

Thanks for clarifying. I agree it makes sense that the (current) defaults are specified on the assets/sensors. Specifically: the device's flex model as a sensor attribute, and the device's flex context as an asset attribute (because this context is potentially shared with other flexible devices).
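That split could be sketched roughly like this (the stand-in classes and attribute keys are hypothetical, just to show where each default would live):

```python
class Asset:
    """Hypothetical stand-in for a FlexMeasures asset."""
    def __init__(self, attributes: dict):
        self.attributes = attributes


class Sensor:
    """Hypothetical stand-in for a FlexMeasures sensor, belonging to an asset."""
    def __init__(self, generic_asset: Asset, attributes: dict):
        self.generic_asset = generic_asset
        self.attributes = attributes


def default_scheduling_specs(power_sensor: Sensor) -> dict:
    """Device flex-model from the sensor; shared flex-context from its asset."""
    return {
        "flex-model": power_sensor.attributes.get("flex-model", {}),
        "flex-context": power_sensor.generic_asset.attributes.get("flex-context", {}),
    }


site = Asset({"flex-context": {"consumption-price-sensor": 5}})
battery_power = Sensor(site, {"flex-model": {"soc-max": "90 kWh"}})
default_scheduling_specs(battery_power)
# → {'flex-model': {'soc-max': '90 kWh'}, 'flex-context': {'consumption-price-sensor': 5}}
```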

@victorgarcia98
Contributor

> I agree, that information should go towards the DataSource.attributes. But perhaps the sensor or asset should still store some information about which data source / generator should be used as a default?

Good idea! That way we can keep triggering schedules based on the sensor id.

@victorgarcia98
Contributor

victorgarcia98 commented Aug 12, 2023

> I don't agree that data sources are the correct and expected way to store this information about reality.

I'm not against keeping most of the information in the Asset or Sensor attributes, especially when we are talking about physical parameters: capacity_in_mw, soc_max (battery capacity), etc.

Nonetheless, I see some benefits of keeping the description of the scheduler as DataSource attributes:

  • In a MIMO setting, where we have to schedule multiple devices, to which of the sensors would you attribute the schedule? Perhaps this could be on a GridConnection asset.
  • In relation to the meaning of DataSource, I understand a DataSource as the origin of data: where or what generated it. In the case of a schedule, the origin of the data is the scheduler class itself plus the parameters involved. For example, the scheduler config could contain the topology, which could change (e.g. unavailable PV). An alternative would be to tie the schedule to a "Grid Connection" asset. Another example is the prefer_charging_sooner parameter. Other data, such as the initial SOC, should definitely be tied to the asset/sensor attributes.
  • We can keep track of the change in the parameters as the resulting schedules would have a different DataSource.

More generally, we need to decide where each of the fields in flex-context and flex-model will live:

  • config (DataSource)
  • parameters (DataSource)
  • Asset
  • Sensor
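Whatever split is chosen, a single precedence order could resolve each field, e.g. with a `ChainMap`. The order shown below (parameters over config over sensor over asset) is only one possibility; deciding the actual order is exactly the open question here:

```python
from collections import ChainMap


def resolve_field(name: str, parameters: dict, config: dict,
                  sensor_attrs: dict, asset_attrs: dict):
    """First match wins: parameters > config > sensor > asset."""
    return ChainMap(parameters, config, sensor_attrs, asset_attrs).get(name)


resolve_field("soc-max", {}, {}, {"soc-max": "90 kWh"}, {"soc-max": "100 kWh"})
# → '90 kWh' (the sensor attribute shadows the asset attribute)
```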

@nhoening
Contributor Author

Instead of a GridConnection, I propose to use Site in #810.

I agree that these attributes are not all the same type of reality. Some are about physics (e.g. max_soc), some about preferences (e.g. prefer_charging_sooner). [I haven't looked at them all for now]

Either the default info is on the physical model (sensor/asset/site), and schedulers (data sources) can save custom values (overwriting standard values on the physical model) for all the preferential settings.
Or schedulers can update everything.

Those are the two directions I see here, and both have merits.

Something I came across for motivation (from Kill It With Fire: Manage Aging Computer Systems):

"The lesson to learn here is the systems that feel familiar to people always provide more value than the systems that have structural elegances but run contrary to expectations."

And people will feel that the physical things are the main or only places to think about when modeling.

@nhoening
Contributor Author

Btw, this issue should become a discussion first, right?

@victorgarcia98
Contributor

> Instead of a GridConnection, I propose to use Site in #810.

In either case (keeping it at the Asset or the Site level), I would work towards supporting hierarchical descriptions. This might be useful for campus-like installations, where there can be many internal power constraints (building level, zone level, campus level).

Somehow, I lean more towards making the Schedulers their own entity, similarly to what we did with the Reporters. On the one hand, we can control which parameter changes require the creation of a new DataSource, which allows tracking changes. On the other, we can reuse the same Scheduler for multiple sites.

For example, for a hypothetical future scheduler that optimizes participation in different ancillary services (FCR, aFRR, ...), we could define different schedulers to participate in different markets (e.g. just FCR, or both FCR and aFRR). Maybe for this we need an entity to represent a pool of assets (which could, in turn, be an asset).

> Either the default info is on the physical model (sensor/asset/site), and schedulers (data sources) can save custom values (overwriting standard values on the physical model) for all the preferential settings.

Regarding fallback mechanisms, I think they are convenient, as they avoid having to pass all the parameters, but they can also induce a lot of confusion and a lack of traceability.

"The lesson to learn here is the systems that feel familiar to people always provide more value than the systems that have structural elegances but run contrary to expectations."

Good quote!

> Btw, this issue should become a discussion first, right?

I agree, this became a more transversal discussion.

@FlexMeasures FlexMeasures locked and limited conversation to collaborators Aug 21, 2023
@victorgarcia98 victorgarcia98 converted this issue into discussion #827 Aug 21, 2023
