A set of routines to enable construction of completely unstructured multifidelity surrogate models for fusing multiple information sources. For detailed background on how it works, please see
- Gorodetsky, Alex A., John D. Jakeman, and Gianluca Geraci. “MFNets: data efficient all-at-once learning of multifidelity surrogates as directed networks of information sources.” Computational Mechanics 68.4 (2021): 741-758. http://arxiv.org/abs/2008.02672
For the autogenerated documentation (using pdoc) please see here.
This library depends on
The main routines are in net.py, net_torch.py, and net_pyro.py. The latter two, respectively, use PyTorch to enable arbitrary node and edge models and provide a probabilistic inference capability. There are tests in test_mfnet.py and in test_mfnet_torch.py.
Essentially, the process for building and training a surrogate has two steps:
- Set up a network
- Train a network
A multifidelity surrogate is defined by a set of functions along the nodes and edges. Each of these functions can be user specified. Below is an example of using linear functions along the nodes and edges for the case where training data comes from eight information sources.
import numpy as np


def lin(param, xinput):
    """A linear parametric model

    Parameters
    ----------
    param : np.ndarray (nparams)
        The parameters of the model
    xinput : np.ndarray (nsamples,nparams-1)
        The independent variables of the model

    Returns
    -------
    vals : np.ndarray (nsamples)
        Evaluation of the linear model
    grad : np.ndarray (nsamples,nparams)
        gradient of the linear model with respect to the model parameters
    """
    # Gradient w.r.t. param is [1, x]: a column of ones for the intercept
    one = np.ones((xinput.shape[0], 1))
    grad = np.concatenate((one, xinput), axis=1)
    return param[0] + np.dot(param[1:], xinput.T), grad
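As a quick sanity check of the node and edge model above, the sketch below evaluates a copy of `lin` on a few random inputs and compares the returned gradient against central finite differences. The sample sizes, seed, and step size are arbitrary choices for illustration.

```python
import numpy as np


def lin(param, xinput):
    """Copy of the linear model above: param[0] + xinput @ param[1:]."""
    one = np.ones((xinput.shape[0], 1))
    grad = np.concatenate((one, xinput), axis=1)
    return param[0] + np.dot(param[1:], xinput.T), grad


rng = np.random.default_rng(0)
param = rng.standard_normal(2)              # intercept and slope
xinput = rng.uniform(-1, 1, size=(5, 1))    # 5 samples, 1 input dimension

vals, grad = lin(param, xinput)

# Central finite differences in each parameter direction
eps = 1e-6
grad_fd = np.zeros_like(grad)
for k in range(param.size):
    step = np.zeros_like(param)
    step[k] = eps
    plus = lin(param + step, xinput)[0]
    minus = lin(param - step, xinput)[0]
    grad_fd[:, k] = (plus - minus) / (2 * eps)

print(vals.shape, grad.shape)
print("gradient matches finite differences:", np.allclose(grad, grad_fd, atol=1e-6))
```

The same kind of check is a reasonable first test for any user-specified node or edge function before it is placed on the graph.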
Next, we set up a network for the surrogates.
import networkx as nx
import numpy as np


def make_graph_8(nnode_param=2, nedge_param=2, linfunc=lin):
    """A graph with 8 nodes

    3 -> 7 -> 8
              ^
              |
         1 -> 4
            / ^
           /  |
    2 -> 5 -> 6
    """
    graph = nx.DiGraph()

    # One random parameter vector per node and per edge
    pnodes = np.random.randn(8, nnode_param)
    pedges = np.random.randn(8, nedge_param)

    for node in range(1, 9):
        graph.add_node(node, param=pnodes[node-1], func=linfunc)

    graph.add_edge(1, 4, param=pedges[0, :], func=linfunc)
    graph.add_edge(2, 5, param=pedges[1, :], func=linfunc)
    graph.add_edge(5, 6, param=pedges[2, :], func=linfunc)
    graph.add_edge(6, 4, param=pedges[3, :], func=linfunc)
    graph.add_edge(3, 7, param=pedges[4, :], func=linfunc)
    graph.add_edge(7, 8, param=pedges[5, :], func=linfunc)
    graph.add_edge(4, 8, param=pedges[6, :], func=linfunc)
    graph.add_edge(5, 4, param=pedges[7, :], func=linfunc)

    # Nodes with no incoming edges
    roots = set([1, 2, 3])
    return graph, roots
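To see why nodes 1, 2, and 3 are passed as the roots, the following standard-library sketch rebuilds the same edge list, finds the nodes with no incoming edges, and computes one parents-first ordering. It is only an illustration of the graph structure and does not use or reproduce anything from the library itself.

```python
from graphlib import TopologicalSorter

# Edge list copied from make_graph_8 above, as (parent, child) pairs
edges = [(1, 4), (2, 5), (5, 6), (6, 4), (3, 7), (7, 8), (4, 8), (5, 4)]
nodes = range(1, 9)

# Map each node to the set of its parents
parents = {node: set() for node in nodes}
for parent, child in edges:
    parents[child].add(parent)

# Roots are the nodes with no incoming edges
roots = {node for node, pa in parents.items() if not pa}

# One valid parents-first ordering; raises graphlib.CycleError on a cyclic graph
order = list(TopologicalSorter(parents).static_order())

print("roots:", sorted(roots))
print("parents-first order:", order)
```

Node 8 is the only node without outgoing edges, which is consistent with it being treated as the highest-fidelity source in the example that follows.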
Next, we convert the graph into a multifidelity surrogate.
from net import MFSurrogate
num_nodes = 8
graph, roots = make_graph_8(2, 2, linfunc=lin)
surr = MFSurrogate(graph, roots) # create the surrogate
param0 = surr.get_param() # get the initial parameters (randomized)
# The call below trains the surrogate
# all_nodes -> list of node indices for which data is available
# input_train -> list of input features for the nodes in all_nodes
# output_train -> list of the outputs for each of the nodes
# std -> list of standard deviations of the errors for each of the training sets
surr_learned = surr.train(param0, all_nodes, input_train, output_train, std, niters=400, verbose=False, warmup=True)
# Get evaluations of the highest fidelity model
# samples should be some inputs at which to evaluate the model
evals_hf = surr_learned.forward(samples, num_nodes)
evals_surr = surr_learned.get_evals() # can also get all the fidelity evaluations at *samples*
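The snippet above leaves `all_nodes`, `input_train`, `output_train`, and `std` undefined. A minimal sketch of how such lists might be assembled from synthetic data is given below. The per-source "truth" function `source`, the sample counts, and the array shapes (inputs of shape `(nsamples, 1)`, outputs of shape `(nsamples,)`) are assumptions made for illustration; test_mfnet.py is the place to confirm the layout the library actually expects.

```python
import numpy as np

rng = np.random.default_rng(0)


def source(node, x):
    """Hypothetical information source: a line whose slope depends on the node."""
    return (1.0 + 0.1 * node) * x[:, 0] + 0.5


all_nodes = [1, 2, 3, 4, 5, 6, 7, 8]
# Fewer samples for later nodes, mimicking more expensive sources (assumption)
nsamples = [50, 50, 50, 20, 20, 20, 10, 5]
noise_std = 1e-2

input_train, output_train, std = [], [], []
for node, n in zip(all_nodes, nsamples):
    x = rng.uniform(-1, 1, size=(n, 1))
    y = source(node, x) + noise_std * rng.standard_normal(n)
    input_train.append(x)       # one input array per node
    output_train.append(y)      # one output array per node
    std.append(noise_std)       # one noise level per training set

print([x.shape for x in input_train])
```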
To clarify the training function signature, below I reproduce the documentation of the function.
def train(self, param0in, nodes, xtrain, ytrain, stdtrain, niters=200,
          func=least_squares,
          verbose=False, warmup=True, opts=dict()):
    """Train the multifidelity surrogate.

    This is the main entrance point for data-driven training.

    Parameters
    ----------
    param0in : np.ndarray (nparams)
        The initial guess for the parameters
    nodes : list
        A list of nodes for which data is available
    xtrain : list
        A list of input features for each node in *nodes*
    ytrain : list
        A list of output values for each node in *nodes*
    stdtrain : float
        The standard deviation for data for each node in *nodes*
    niters : integer
        The number of optimization iterations
    func : callable
        A scalar valued objective function with the signature
        ``func(target, predicted) -> val (float), grad (np.ndarray)``
        where ``target`` is a np.ndarray of shape (nobs)
        containing the observations and ``predicted`` is a np.ndarray of
        shape (nobs) containing the model predictions of the observations
    verbose : integer
        The verbosity level
    warmup : boolean
        Specify whether or not to progressively find a good guess before
        optimizing

    Returns
    -------
    Upon completion of this function, the parameters of the graph are set
    to the values that best fit the data, as defined by *func*
    """
    ...
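The `func` argument accepts any objective with the signature `func(target, predicted) -> val, grad`. As a sketch of what a user-supplied alternative to the default might look like, the block below defines a Huber-type objective. The name `huber_objective` and its `delta` parameter are hypothetical, and the gradient is assumed to be taken with respect to `predicted`, which the docstring does not state explicitly.

```python
import numpy as np


def huber_objective(target, predicted, delta=1.0):
    """Hypothetical robust objective: quadratic for small residuals, linear for large.

    Returns the scalar objective and its gradient with respect to `predicted`
    (assumed convention).
    """
    resid = predicted - target
    absr = np.abs(resid)
    small = absr <= delta
    val = np.sum(np.where(small, 0.5 * resid**2, delta * (absr - 0.5 * delta)))
    grad = np.where(small, resid, delta * np.sign(resid))
    return float(val), grad


target = np.zeros(3)
predicted = np.array([0.5, 2.0, -3.0])
val, grad = huber_objective(target, predicted)
print(val, grad)
```

Under these assumptions it would be supplied as `func=huber_objective` in the call to `train`.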
The PyTorch interface to MFNets provides significant flexibility to have arbitrary functional representations of nodes and edges.
Here we provide an example of using a PyTorch-enabled MFNet. First the graph is defined, with each node and edge having a func attribute which defines the model. In the example below they are all linear models, but they can be any PyTorch model.
import networkx as nx
import torch


def make_graph_8():
    """A graph with 8 nodes

    3 -> 7 -> 8
              ^
              |
         1 -> 4
            / ^
           /  |
    2 -> 5 -> 6
    """
    graph = nx.DiGraph()

    # Every node and edge model maps a 1D input to a scalar output
    dinput = 1
    for node in range(1, 9):
        graph.add_node(node, func=torch.nn.Linear(dinput, 1, bias=True))

    graph.add_edge(1, 4, func=torch.nn.Linear(dinput, 1, bias=True))
    graph.add_edge(2, 5, func=torch.nn.Linear(dinput, 1, bias=True))
    graph.add_edge(5, 6, func=torch.nn.Linear(dinput, 1, bias=True))
    graph.add_edge(6, 4, func=torch.nn.Linear(dinput, 1, bias=True))
    graph.add_edge(3, 7, func=torch.nn.Linear(dinput, 1, bias=True))
    graph.add_edge(7, 8, func=torch.nn.Linear(dinput, 1, bias=True))
    graph.add_edge(4, 8, func=torch.nn.Linear(dinput, 1, bias=True))
    graph.add_edge(5, 4, func=torch.nn.Linear(dinput, 1, bias=True))

    # Nodes with no incoming edges
    roots = set([1, 2, 3])
    return graph, roots
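Since the node and edge models can be any PyTorch model, a nonlinear module with the same input and output sizes could stand in for `torch.nn.Linear`. The sketch below builds a small multilayer perceptron and confirms that it produces outputs of the same shape as the linear model. The hidden width of 8 and the `Tanh` activation are arbitrary choices for illustration, not something the library prescribes.

```python
import torch

dinput = 1

# The linear model used in the example above
linear = torch.nn.Linear(dinput, 1, bias=True)

# A small multilayer perceptron with the same input and output sizes
mlp = torch.nn.Sequential(
    torch.nn.Linear(dinput, 8),
    torch.nn.Tanh(),
    torch.nn.Linear(8, 1),
)

x = torch.rand(10, dinput)
y_linear = linear(x)
y_mlp = mlp(x)

# Both models map (nsamples, dinput) inputs to (nsamples, 1) outputs
print(y_linear.shape, y_mlp.shape)
```

Such a module would then be attached in the same way as the linear ones, for example `graph.add_node(node, func=mlp)`, with a separate instance created per node or edge so that parameters are not shared unintentionally.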
Next, the graph and its roots are used to initialize an MFNet
graph, roots = make_graph_8()
mfsurr = MFNetTorch(graph, roots)
One can evaluate any sequence of nodes at any inputs. For instance, let us evaluate the 2nd and 4th nodes at random locations
xtwo = torch.rand(10, 1)
xfour = torch.rand(4, 1)
y = mfsurr([xtwo, xfour], [2, 4]) # outputs a list of tensors representing the outputs
Training the model is similarly easy. First we set up loss functions corresponding to each node via
loss_fns = construct_loss_funcs(mfsurr) # list of loss functions ordered by node
Then training proceeds using a list of PyTorch DataLoader instances (for an example see [here](mfnets_surrogates/test_mfnet_torch.py))
xtrain_two = torch.rand(4, 1)
xtrain_four = torch.rand(8, 1)
# Create random data for demonstration purposes
ytrain_two = xtrain_two.flatten()**2
ytrain_four = xtrain_four.flatten() + 2
# ArrayDataset defined in net_torch
dataset2 = ArrayDataset(xtrain_two, ytrain_two)
dataset4 = ArrayDataset(xtrain_four, ytrain_four)
data_loaders = [torch.utils.data.DataLoader(dataset2, batch_size=4, shuffle=False),
                torch.utils.data.DataLoader(dataset4, batch_size=8, shuffle=False)]
# get the loss functions corresponding to nodes 2 and 4
# (the list is ordered by node, so nodes 2 and 4 are at indices 1 and 3)
loss_fn_use = [loss_fns[1], loss_fns[3]]
# train
mfsurr.train(data_loaders, [2, 4], loss_fn_use)
The PyTorch example showed how to train a deterministic MFNet that does not account for the uncertainty in the node and edge functions that remains due to insufficient data. For this, we can use the Pyro probabilistic programming language. The setup for the graph is identical to the PyTorch example. However, we now instantiate the model as
# variance of noisy output is now a new parameter input
model = MFNetProbModel(graph, roots, noise_var=1e-2)
Being a probabilistic model, evaluations at the same locations yield different results
xtwo = torch.rand(10, 1)
xfour = torch.rand(4, 1)
# The two evaluations are different!
y_sample1 = model([xtwo, xfour], [2, 4])
y_sample2 = model([xtwo, xfour], [2, 4])
Multiple inference algorithms are possible. Please see the command line utility mfnet_cmd.py for examples of how to run different algorithms. The data setup is identical to that of the PyTorch training. However, the training procedure itself is different. For example, to run the NUTS sampler and generate predictive evaluations, one would use
from pyro.infer import MCMC, NUTS, Predictive
nuts_kernel = NUTS(model, full_mass=True)
mcmc = MCMC(
    nuts_kernel,
    num_samples=5000,
    warmup_steps=1000,
    num_chains=1,
)
# Run the inference
mcmc.run(X, target_nodes, Y)
# Get the samples
param_samples = mcmc.get_samples()
# Convert samples to pandas dataframe for future processing
df = samples_to_pandas(param_samples)
# Create a predictive model which uses the samples from the posterior
predictive = Predictive(model, mcmc.get_samples())
# Evaluate the model over all samples from MCMC
predicted_vals = predictive([xtwo, xfour], [2, 4])
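Pyro's `Predictive` returns a dictionary of tensors whose leading dimension indexes the posterior draws, so a typical next step is to reduce over that dimension. The sketch below shows one way to compute a predictive mean and a 90% interval with NumPy. The `draws` array is synthetic stand-in data with an assumed shape of `(num_samples, num_inputs)`; in practice it would come from one entry of `predicted_vals` converted to a NumPy array.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one entry of predicted_vals:
# 5000 posterior draws at 10 input locations (assumed layout)
draws = 2.0 + 0.1 * rng.standard_normal((5000, 10))

# Reduce over the sample dimension
mean = draws.mean(axis=0)
lower, upper = np.quantile(draws, [0.05, 0.95], axis=0)

for i in range(3):
    print(f"location {i}: mean={mean[i]:.3f}, 90% interval=[{lower[i]:.3f}, {upper[i]:.3f}]")
```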
Similarly, to run a stochastic variational inference (SVI) procedure one would do
import logging

from pyro.infer import Predictive, SVI, Trace_ELBO
from pyro.infer.autoguide import AutoMultivariateNormal
from pyro.optim import Adam

adam_params = {"lr": 0.005, "betas": (0.95, 0.999)}
optimizer = Adam(adam_params)
# variational distribution, e.g., full multivariate gaussian
guide = AutoMultivariateNormal(model)
# Run the inference
svi = SVI(model, guide, optimizer, loss=Trace_ELBO())
num_steps = 1000
for step in range(num_steps):
    elbo = svi.step(X, target_nodes, Y)
    if step % 100 == 0:
        logging.info(f"Iteration {step}\t Elbo loss: {elbo}")
# Create a predictive model which uses the samples from the variational distribution
num_samples = 10000
predictive = Predictive(model, guide=guide, num_samples=num_samples)
pred = predictive([xtwo, xfour], [2, 4])
# Get the samples of the parameters
param_samples = {k: v.reshape(num_samples)
                 for k, v in pred.items() if k[:3] != "obs"}
df = samples_to_pandas(param_samples)
# Get the samples of the values
vals = {k: v for k, v in pred.items() if k[:3] == "obs"}
Please cite the following paper if you find this code to be useful
- Gorodetsky, Alex A., John D. Jakeman, and Gianluca Geraci. “MFNets: data efficient all-at-once learning of multifidelity surrogates as directed networks of information sources.” Computational Mechanics 68.4 (2021): 741-758.
Author: Alex Gorodetsky
Contact: goroda@umich.edu
Copyright (c) 2020 Alex Gorodetsky
License: MIT