Multi scenario 15 May 2020 meeting notes


Current Pynsim structure


The current implementation of Pynsim allows the user to:

1. Define a set of Nodes and Links, optionally grouped by Institutions, all interconnected by a Network. The Pynsim Components (Nodes, Links, Networks and Institutions) can be subclassed to create bespoke classes and objects to be used inside the user's models.
2. Subclass the Engine class to define one or more engines that can manipulate the network's components (nodes and links), thereby altering the network status.
3. Define a simulation, passing variables and scenarios to the components defined in step 1, assigning the engines defined in step 2, and then running the engines in order.

Overall, the simulation runs the engines in a predefined order (the order in which they were added), and each engine operates on the defined components using the passed variables and scenarios.
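As a rough illustration of the structure described above, here is a minimal, self-contained sketch. The classes `Node`, `Engine` and `Simulator` are simplified stand-ins written for this example, not the actual Pynsim API, and the `Inflow` and `Release` engines are hypothetical.

```python
class Node:
    """Stand-in for a network component that holds state."""
    def __init__(self, name, storage=0.0):
        self.name = name
        self.storage = storage


class Engine:
    """Stand-in base class: subclasses override run()."""
    def __init__(self, name, nodes):
        self.name = name
        self.nodes = nodes

    def run(self, timestep):
        raise NotImplementedError


class Inflow(Engine):
    """Hypothetical engine: adds a fixed inflow to every node."""
    def run(self, timestep):
        for node in self.nodes:
            node.storage += 10.0


class Release(Engine):
    """Hypothetical engine: releases half of the stored volume."""
    def run(self, timestep):
        for node in self.nodes:
            node.storage *= 0.5


class Simulator:
    """Runs every engine at every timestep, in the order they were added."""
    def __init__(self, timesteps):
        self.timesteps = timesteps
        self.engines = []

    def add_engine(self, engine):
        self.engines.append(engine)

    def start(self):
        for timestep in self.timesteps:
            for engine in self.engines:  # insertion order
                engine.run(timestep)


reservoir = Node("reservoir")
sim = Simulator(timesteps=[0, 1])
sim.add_engine(Inflow("inflow", [reservoir]))
sim.add_engine(Release("release", [reservoir]))
sim.start()
print(reservoir.storage)
```

The only ordering mechanism in this sketch is the list of engines, which is the limitation discussed in the next section.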

Usage of Pynsim

In practice, the way the group uses Pynsim does not seem to follow this design. This is what I gathered from the meeting.

The user defines a model: the engines/models contain both code and data (variables and scenarios). The nodes are custom-designed inside the model itself, without using the Pynsim components. The model uses Pynsim components JUST as a shared (input/output) area for data: these nodes store the result of an engine's computation so that it can be passed to the following engine.
The engines are interdependent, but the interaction is not necessarily unidirectional and cannot be predetermined by an array ordering alone.
They tried implementing multiscenario with their current architecture: every model/engine internally holds the whole data matrix of the first engine multiplied by that of the second, by that of the third, and so on. Because every engine is developed independently, they had trouble keeping this structure coherent across multiple iterations, because the matrix keeps growing in dimensions at every iteration even when that is not needed.

MOEA

In the single-scenario configuration, the MOEA lets them provide different variable values at every iteration while using the same scenario in every iteration. This allows it to find a local best solution by comparing the results of every iteration. To improve performance, the library runs each simulation instance on a separate CPU core/thread, potentially using all the available cores/threads. In this situation, parallelizing the multi-scenario execution is not possible because the MOEA already uses all the hardware resources.

A multiscenario controller that acts as a master controller running independent simulations is therefore not what they need.

Proposed solution

At this point, because the engines/models struggle to manage independent scenarios, they need something that acts as a black-box storage with internal indexes, working as a two-dimensional iterator, that will contain:

Scenarios for all the engines/models
Node status at the start and the end of every timestep, for each scenario.

We need to implement a new class “Multiscenario”, through which every engine can “get” and “set” data for all the scenarios.

Possible implementation

1: In a shared-memory environment for all the engines/models, the class will be instantiated as a singleton object shared by all the engines/models.
2: If the memory is not shared, the class will be instantiated as a singleton object exposing an API that the various engines will call synchronously.
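For the shared-memory case, a minimal sketch of such a singleton could look like the following. The method signatures and the `(engine_id, scenario, key)` indexing are assumptions made for illustration; the notes only state that engines “get” and “set” data.

```python
class Multiscenario:
    """Singleton store: every call to Multiscenario() returns the same object."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._data = {}  # created once, shared by all engines
        return cls._instance

    def set(self, engine_id, scenario, key, value):
        # Hypothetical indexing: one value per (engine, scenario, key).
        self._data[(engine_id, scenario, key)] = value

    def get(self, engine_id, scenario, key):
        return self._data[(engine_id, scenario, key)]


# Two engines obtain "their" store independently and still share the data.
store_a = Multiscenario()
store_b = Multiscenario()
store_a.set("engine_1", "dry", "flow", 2)
print(store_b.get("engine_1", "dry", "flow"))
```

In the non-shared case the same `get`/`set` interface would sit behind a remote API instead of a Python object.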

Note

Keep in mind that everything defined below is executed for every single MOEA variable setting.

Setup of the object

Every engine, in its setup phase, will pass to the “multiscenario” object (MSO):

MOEA current iteration ID
Engine ID
All the variables defined for the engine
All the Scenarios for the engine
All the current values for every component (node, link, etc) of the engine.

The MSO will combine all the scenarios of all the registered engines, creating an internal multi-scenario representation. This table will be fully serialized to DB, while the first two timestep rows will be kept in memory for performance reasons.
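One way to read “combine all the scenarios” is as a Cartesian product of each engine's scenario list. That reading is an assumption, and the engine IDs and scenario names below are hypothetical. Under that assumption, a minimal sketch is:

```python
from itertools import product

# Hypothetical scenarios registered by two engines during setup.
engine_scenarios = {
    "hydrology": ["dry", "wet"],
    "demand": ["low", "high"],
}


def combine_scenarios(engine_scenarios):
    """Return one dict per combined scenario, mapping engine ID -> scenario."""
    engine_ids = sorted(engine_scenarios)  # fixed order for reproducibility
    combos = product(*(engine_scenarios[e] for e in engine_ids))
    return [dict(zip(engine_ids, combo)) for combo in combos]


combined = combine_scenarios(engine_scenarios)
for row in combined:
    print(row)
```

The number of combined scenarios is the product of the per-engine counts, which is why the table is meant to live mostly in the DB rather than in RAM.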

Simulation run

Every engine will start each iteration by asking the MSO, for each component, for the next timestep and the next scenario for that timestep (the MSO manages both; in the first iteration, “next” means “first”). The MSO will return:

the current status of the resource (node, etc.) for the tuple (iteration, timestep, scenario)
the next scenario value
the next timestep value

In the request, the engine passes:

MOEA iteration ID
Engine ID
Component ID

Using these values, the engine will set the component status to the received value, use the current scenario value, and then apply itself to the component. Once this execution has taken place, the result is returned to the MSO using:

MOEA iteration ID
Engine ID
Component ID
Final status of the component

This makes it possible to run all the simulations in row order, that is, using all the scenarios for a given timestep before moving on to the following timestep.

Setting the value of a component in this way marks the END of the current iteration for the (component, scenario, timestep) tuple.
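The request/response cycle above can be sketched as follows. This is only an illustration under several assumptions: the class name `MSO`, the method names `get_next` and `set_result`, and the rule that a timestep starts from the previous timestep's result for the same scenario are all hypothetical choices, not something decided in the meeting.

```python
class MSO:
    """Hands out (status, scenario, timestep) in row order and records results."""

    def __init__(self, timesteps, scenarios, initial_status):
        self.timesteps = timesteps
        # Row order: every scenario of a timestep before the next timestep.
        self._order = [(i, s) for i in range(len(timesteps)) for s in scenarios]
        self._initial = initial_status
        self._cursor = {}   # (moea, engine, component) -> position in _order
        self._results = {}  # (moea, engine, component, t_index, scenario) -> status

    def get_next(self, moea_id, engine_id, component_id):
        key = (moea_id, engine_id, component_id)
        index = self._cursor.get(key, 0)
        if index >= len(self._order):
            return None  # nothing left to evaluate
        t_index, scenario = self._order[index]
        # Start from the previous timestep's result for the same scenario.
        status = self._results.get(key + (t_index - 1, scenario), self._initial)
        return status, scenario, self.timesteps[t_index]

    def set_result(self, moea_id, engine_id, component_id, final_status):
        key = (moea_id, engine_id, component_id)
        index = self._cursor.get(key, 0)
        t_index, scenario = self._order[index]
        self._results[key + (t_index, scenario)] = final_status
        self._cursor[key] = index + 1  # setting the result ends this evaluation


# A hypothetical engine that adds a scenario-dependent inflow to a reservoir.
mso = MSO(timesteps=[2020, 2021], scenarios=["dry", "wet"], initial_status=100.0)
inflow = {"dry": 5.0, "wet": 20.0}
visited = []
while True:
    step = mso.get_next("m0", "e0", "reservoir")
    if step is None:
        break
    status, scenario, timestep = step
    visited.append((timestep, scenario))
    mso.set_result("m0", "e0", "reservoir", status + inflow[scenario])
print(visited)
```

The engine never tracks the timestep or scenario itself; it only asks for the next item and reports the final status.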

Possible ways to set data on a node

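# Inside engine X
node.flow = 2

# In the node class
def __setattr__(self, name, value):
    # Store the attribute as usual, then forward it to the data manager.
    super().__setattr__(name, value)
    # Open question: how does the node find out which engine is setting the value?
    engineid = get_engine_id()
    self.moea_data_manager.set(name, engineid, value, self.simulation.MOEAID)

OR

# Inside engine X: the engine passes its own ID explicitly
flow = 2
engineid = self.id
self.moea_data_manager.set('flow', engineid, flow, self.simulation.MOEAID)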
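For the first option, one possible answer to the open question is to let the simulation record which engine is currently running, so the node can read it when an attribute is set. The sketch below is a self-contained illustration under that assumption. `DataManager`, `Simulation.current_engine_id` and the wiring in `Node.__init__` are hypothetical and not part of Pynsim.

```python
class DataManager:
    """Stand-in for the data manager: just records every set() call."""
    def __init__(self):
        self.records = []

    def set(self, name, engine_id, value, moea_id):
        self.records.append((name, engine_id, value, moea_id))


class Simulation:
    def __init__(self, moea_id):
        self.MOEAID = moea_id
        self.current_engine_id = None  # assumed to be updated before each engine runs


class Node:
    def __init__(self, simulation, data_manager):
        # Bypass our own __setattr__ so internal wiring is not forwarded.
        object.__setattr__(self, "simulation", simulation)
        object.__setattr__(self, "moea_data_manager", data_manager)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Forward every attribute set by an engine to the data manager.
        self.moea_data_manager.set(
            name, self.simulation.current_engine_id, value, self.simulation.MOEAID
        )


sim = Simulation(moea_id=7)
manager = DataManager()
node = Node(sim, manager)

sim.current_engine_id = "X"  # the simulation marks engine X as running
node.flow = 2                # engine X sets a value on the node
print(manager.records)
```

The second option avoids this hidden dependency on the simulation state, at the cost of more verbose engine code.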

Performance

Because the overall data matrix can become very large and strain memory, the MSO will always keep the smallest possible set of data in RAM, releasing data to DB as soon as it is no longer needed. For example, suppose we are in this situation:

MOEA ID : M
Engine ID: E
Component ID: C
Timestep: T

MSO will hold in memory just the data for:

(M, E, C, All scenario(T)): the values that are the input of the current timestep iteration. Each value will be independently serialized to DB once it has been used by the following iteration.
(M, E, C, All scenario(T+1)): the values that will contain the results of the current timestep iteration.
(M, E, C, All scenario(T+2)): the values that are preloaded for performance reasons.

All older data is automatically released from RAM, and the next required data is loaded automatically from DB when needed.
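A minimal sketch of this in-memory window follows, using an in-memory SQLite database as a stand-in for the real DB. The class name `TimestepWindow`, the table layout and the JSON payload format are assumptions made for this example.

```python
import json
import sqlite3


class TimestepWindow:
    """Keeps timesteps T .. T+size-1 in RAM; older ones are released to the DB."""

    def __init__(self, size=3):
        self.size = size
        self.in_memory = {}  # timestep -> {scenario: value}
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE status (timestep INTEGER PRIMARY KEY, payload TEXT)"
        )

    def put(self, timestep, values):
        self.in_memory[timestep] = values

    def load(self, timestep):
        """Read one timestep back from the DB, or None if it was never saved."""
        row = self.db.execute(
            "SELECT payload FROM status WHERE timestep = ?", (timestep,)
        ).fetchone()
        return None if row is None else json.loads(row[0])

    def advance(self, current):
        # Release everything older than the current timestep to the DB.
        for t in sorted(self.in_memory):
            if t < current:
                payload = json.dumps(self.in_memory.pop(t))
                self.db.execute(
                    "INSERT OR REPLACE INTO status VALUES (?, ?)", (t, payload)
                )
        # Preload whatever the DB already holds for T, T+1, T+2.
        for t in range(current, current + self.size):
            if t not in self.in_memory:
                values = self.load(t)
                if values is not None:
                    self.in_memory[t] = values


window = TimestepWindow()
window.put(0, {"dry": 1.0, "wet": 2.0})
window.put(1, {"dry": 1.5, "wet": 2.5})
window.advance(current=1)          # timestep 0 is no longer needed in RAM
print(sorted(window.in_memory))    # timesteps still held in memory
print(window.load(0))              # timestep 0 is still available from the DB
```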

Note

We can deduce that the MSO will act as an arbiter, preventing the engines from running on timesteps that are too far apart.

Moreover, when running iteration T, no engine uses data from timestep T-2 or any earlier one.

EDIT 1 June 2020

After the call, we realized that the Pynsim components are used to create new components inside the engines.

The Pynsim native components are used just as exchange buffers.

Minutes:

Iterations are ALWAYS done within the same timestep.

add a class inside the simulator object (that is shared between all engines) to:

store the scenarios
trigger a node evaluation start => The node status will be replaced with the right values for the current evaluation
trigger a node evaluation end => The node status will be saved in order to be used by the next timestep evaluation
No need to store the full history of each node. Each engine takes care of this itself if it needs it.
Running multiple iterations of a single evaluation will require restoring the initial value of the node multiple times.

As said, we just need 2 timesteps of data:

current status
resulting status of evaluation
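The class described in these minutes could be sketched as follows. All names (`ScenarioStore`, `register`, `start_evaluation`, `end_evaluation`, `next_timestep`) and the stand-in `Node` class are hypothetical. The sketch keeps only the two statuses listed above for each (node, scenario) pair.

```python
import copy


class Node:
    """Stand-in for a node used as an exchange buffer."""
    def __init__(self, name, status):
        self.name = name
        self.status = status


class ScenarioStore:
    """Lives in the simulator; holds scenarios plus two statuses per node."""

    def __init__(self, scenarios):
        self.scenarios = scenarios  # scenario name -> scenario data
        self._current = {}          # status at the start of the timestep
        self._result = {}           # status resulting from the evaluation

    def register(self, node):
        # Every scenario starts from the node's initial status.
        for scenario in self.scenarios:
            self._current[(node.name, scenario)] = copy.deepcopy(node.status)

    def start_evaluation(self, node, scenario):
        # Replace the node status with the right values for this evaluation.
        node.status = copy.deepcopy(self._current[(node.name, scenario)])
        return self.scenarios[scenario]

    def end_evaluation(self, node, scenario):
        # Save the node status for use by the next timestep.
        self._result[(node.name, scenario)] = copy.deepcopy(node.status)

    def next_timestep(self):
        # The results become the starting point; no older history is kept.
        self._current.update(self._result)
        self._result = {}


store = ScenarioStore({"dry": 5.0, "wet": 20.0})
node = Node("reservoir", {"storage": 100.0})
store.register(node)

for timestep in range(2):
    for scenario in store.scenarios:
        inflow = store.start_evaluation(node, scenario)
        node.status["storage"] += inflow
        store.end_evaluation(node, scenario)
    store.next_timestep()

# Repeating an evaluation in the same timestep restarts from the same status.
store.start_evaluation(node, "dry")
print(node.status)
```

Because `start_evaluation` always restores the stored starting status, an engine can iterate on the same evaluation as many times as it needs within a timestep.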
