forgi86/lru-reduction

Model order reduction of deep structured state-space models: A system-theoretic approach

This repository contains the Python code to reproduce the results of the paper Model order reduction of deep structured state-space models: A system-theoretic approach by Marco Forgione, Manas Mejari, and Dario Piga.

Linear Recurrent Unit

The Linear Recurrent Unit (LRU) is a sequence-to-sequence model defined by a linear time-invariant (LTI) dynamical system and implemented in state-space form as:

$$\begin{align} x_{k} &= A_D x_{k-1} + B u_k\\ y_k &= \Re[C x_k] + D u_k, \end{align}$$

where $A_D$ is diagonal and complex-valued; $B, C$ are full complex-valued; $D$ is full real-valued; and $\Re[\cdot]$ denotes the real part of its argument.
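The recurrence above can be sketched in a few lines of NumPy. This is only an illustrative sequential simulation, not the repository's implementation: the function name `lru_forward`, the zero initial state, and the random test matrices are assumptions made for the example.

```python
import numpy as np

def lru_forward(lam, B, C, D, u):
    """Simulate x_k = A_D x_{k-1} + B u_k, y_k = Re[C x_k] + D u_k.

    lam: (n,) complex diagonal of A_D;  B: (n, m) complex;  C: (p, n) complex;
    D: (p, m) real;  u: (T, m) real input sequence.  Returns y of shape (T, p).
    """
    x = np.zeros(lam.shape[0], dtype=complex)   # assumption: zero initial state
    y = np.empty((u.shape[0], C.shape[0]))
    for k in range(u.shape[0]):
        x = lam * x + B @ u[k]                  # diagonal A_D acts elementwise
        y[k] = (C @ x).real + D @ u[k]
    return y

# Hypothetical sizes and random matrices, chosen only for illustration.
rng = np.random.default_rng(0)
n, m, p, T = 4, 2, 1, 50
lam = 0.9 * np.exp(1j * rng.uniform(0, np.pi, n))   # |lambda| < 1, hence stable
B = rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m))
C = rng.normal(size=(p, n)) + 1j * rng.normal(size=(p, n))
D = rng.normal(size=(p, m))
u = rng.normal(size=(T, m))
y = lru_forward(lam, B, C, D, u)
print(y.shape)  # (50, 1)
```

Because $A_D$ is diagonal, the state update costs only an elementwise product per step rather than a full matrix-vector product.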

Smart parameterization and initialization of the system matrices make the LRU block easy to train numerically. Moreover, parallel scan algorithms evaluate the linear recurrence in parallel across time steps, which makes execution fast on modern hardware. For more details, read the original LRU paper from DeepMind.
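To illustrate why a parallel scan applies here, note that each step of the recurrence is an affine map $x \mapsto a x + b$, and composing such maps is associative. The sketch below uses a Hillis-Steele style inclusive scan, which needs about $\log_2 T$ vectorized passes instead of $T$ sequential steps. The function names are hypothetical, and NumPy only shows the data flow; the actual speed-up appears on parallel hardware with a dedicated scan primitive.

```python
import numpy as np

def lru_states_sequential(lam, Bu):
    """Reference loop: x_k = lam * x_{k-1} + Bu_k with x_{-1} = 0."""
    x = np.zeros_like(Bu[0])
    out = np.empty_like(Bu)
    for k in range(Bu.shape[0]):
        x = lam * x + Bu[k]
        out[k] = x
    return out

def lru_states_scan(lam, Bu):
    """Same states via an inclusive scan over the pairs (a_k, b_k) = (lam, Bu_k).

    Combining an earlier pair (a1, b1) with a later pair (a2, b2) gives
    (a2 * a1, a2 * b1 + b2), which is associative.
    """
    T = Bu.shape[0]
    a = np.tile(lam, (T, 1))       # (T, n) multiplicative parts
    b = Bu.copy()                  # (T, n) additive parts
    d = 1
    while d < T:
        a_next, b_next = a.copy(), b.copy()
        # Each element k >= d absorbs the partial result ending at k - d.
        b_next[d:] = a[d:] * b[:-d] + b[d:]
        a_next[d:] = a[d:] * a[:-d]
        a, b = a_next, b_next
        d *= 2
    return b                       # b[k] now equals x_k

rng = np.random.default_rng(1)
T, n = 37, 3                       # T is deliberately not a power of two
lam = 0.95 * np.exp(1j * rng.uniform(0, np.pi, n))
Bu = rng.normal(size=(T, n)) + 1j * rng.normal(size=(T, n))
x_seq = lru_states_sequential(lam, Bu)
x_scan = lru_states_scan(lam, Bu)
print(np.allclose(x_seq, x_scan))
```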

Deep LRU Architecture

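LRU blocks are organized in a deep LRU architecture, in which each layer chains a normalization, a linear recurrence (the LRU block), a static non-linearity, and a skip connection.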

This Norm-Recurrence-StaticNL-Skip architecture is close to the one introduced in the S4 paper, except for inner details of the LTI block. It also has analogies with dynoNet, where LTI blocks described in transfer-function form are interleaved with static non-linearities. Finally, it is somewhat related to a decoder-only Transformer, with information shared across the time steps by an LTI system instead of a causal attention layer.
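A single Norm-Recurrence-StaticNL-Skip layer can be sketched as follows. The specific choices here are assumptions made for illustration and may differ from the repository: layer normalization without learnable gain, a `tanh` of an affine map as the static non-linearity, and the hypothetical names `init_layer`, `deep_lru_layer`, and `deep_lru`.

```python
import numpy as np

def layer_norm(u, eps=1e-5):
    """Normalize each time step across the feature axis (no learnable gain/bias)."""
    mu = u.mean(axis=-1, keepdims=True)
    var = u.var(axis=-1, keepdims=True)
    return (u - mu) / np.sqrt(var + eps)

def lru(lam, B, C, D, u):
    """x_k = lam * x_{k-1} + B u_k,  y_k = Re[C x_k] + D u_k, zero initial state."""
    x = np.zeros(lam.shape[0], dtype=complex)
    y = np.empty((u.shape[0], C.shape[0]))
    for k, u_k in enumerate(u):
        x = lam * x + B @ u_k
        y[k] = (C @ x).real + D @ u_k
    return y

def init_layer(rng, d_model, d_state):
    """Random parameters for one layer (illustrative initialization only)."""
    cplx = lambda *s: (rng.normal(size=s) + 1j * rng.normal(size=s)) / np.sqrt(2 * s[-1])
    return {
        "lam": rng.uniform(0.5, 0.95, d_state) * np.exp(1j * rng.uniform(0, np.pi, d_state)),
        "B": cplx(d_state, d_model),
        "C": cplx(d_model, d_state),
        "D": rng.normal(size=(d_model, d_model)) / np.sqrt(d_model),
        "W": rng.normal(size=(d_model, d_model)) / np.sqrt(d_model),
        "b": np.zeros(d_model),
    }

def deep_lru_layer(p, u):
    z = layer_norm(u)                               # Norm
    z = lru(p["lam"], p["B"], p["C"], p["D"], z)    # Recurrence (LTI block)
    z = np.tanh(z @ p["W"] + p["b"])                # StaticNL (assumed form)
    return u + z                                    # Skip

def deep_lru(layers, u):
    for p in layers:
        u = deep_lru_layer(p, u)
    return u

rng = np.random.default_rng(0)
layers = [init_layer(rng, d_model=3, d_state=8) for _ in range(2)]
u = rng.normal(size=(20, 3))
out = deep_lru(layers, u)
print(out.shape)  # (20, 3)
```

Only the recurrence mixes information across time steps; the normalization, the non-linearity, and the skip connection all act on each time step independently.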

Model order reduction and regularization

We use Model Order Reduction (MOR) to reduce the state dimensionality of deep LRU architectures. We implement state-space truncation and singular perturbation for the system either in modal or in balanced form, resulting in the combinations:

  • Balanced Truncation (BT)
  • Balanced Singular Perturbation (BSP)
  • Modal Truncation (MT)
  • Modal Singular Perturbation (MSP)
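Since $A_D$ is already diagonal, the two modal variants are easy to sketch for a single LRU block. In the example below, truncation simply drops the discarded modes, while singular perturbation replaces them with their steady-state contribution, which preserves the DC gain. The rule used to select modes (keep the largest $|\lambda_i|$) and the function names are assumptions for this sketch and may differ from the criterion used in the paper. The balanced variants additionally need Gramians and a balancing transformation and are not shown.

```python
import numpy as np

def dc_gain(lam, B, C, D):
    """Steady-state gain for constant input: Re[C (I - A_D)^{-1} B] + D."""
    return ((C / (1.0 - lam)) @ B).real + D

def modal_reduce(lam, B, C, D, r, method="truncation"):
    """Reduce a diagonal LRU (lam, B, C, D) to r states.

    method="truncation": discard the dropped modes entirely (MT).
    method="singular_perturbation": fold the steady-state response of the
    dropped modes, x_i = B_i u / (1 - lam_i), into the feedthrough D (MSP).
    """
    order = np.argsort(-np.abs(lam))    # assumption: rank modes by |lambda|
    keep, drop = order[:r], order[r:]
    D_r = D.copy()
    if method == "singular_perturbation":
        D_r = D + ((C[:, drop] / (1.0 - lam[drop])) @ B[drop]).real
    return lam[keep], B[keep], C[:, keep], D_r

rng = np.random.default_rng(0)
n, m, p = 6, 2, 1
lam = rng.uniform(0.3, 0.95, n) * np.exp(1j * rng.uniform(0, np.pi, n))
B = rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m))
C = rng.normal(size=(p, n)) + 1j * rng.normal(size=(p, n))
D = rng.normal(size=(p, m))

mt = modal_reduce(lam, B, C, D, r=3, method="truncation")
msp = modal_reduce(lam, B, C, D, r=3, method="singular_perturbation")
print(np.allclose(dc_gain(*msp), dc_gain(lam, B, C, D)))  # MSP keeps the DC gain
```

Truncation tends to match the fast, high-frequency behavior better, whereas singular perturbation is exact at steady state; which one is preferable depends on the frequency range of interest.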

We introduce regularization techniques that promote parsimonious state-space representations, in particular:

  • LASSO ($\ell_1$-norm) of the eigenvalue magnitudes
  • Hankel nuclear norm
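Both penalties can be sketched for one LRU block with diagonal $A_D$. The $\ell_1$ term is the sum of $|\lambda_i|$. The Hankel nuclear norm is the sum of the Hankel singular values, computed here from closed-form Gramians of the diagonal system. Two assumptions are made and should be checked against the paper: the real-part operator is ignored (the complex-valued system is used), and, because the input $u_k$ enters the state at the same time step, the Markov parameters are $C A_D^k B$ for $k \geq 1$, which corresponds to the standard realization $(A_D, B, C A_D)$. The function names are hypothetical, and in training these quantities would have to be computed with differentiable operations rather than NumPy.

```python
import numpy as np

def eig_l1(lam):
    """LASSO-type penalty: l1-norm of the eigenvalue magnitudes."""
    return np.sum(np.abs(lam))

def hankel_singular_values(lam, B, C):
    """Hankel singular values of the complex diagonal system, sorted descending."""
    C_eff = C * lam   # C A_D: output matrix of the equivalent standard realization
    # Closed-form solutions of the discrete Lyapunov equations for diagonal A_D:
    #   P = A P A^H + B B^H        ->  P_ij = (B B^H)_ij / (1 - lam_i conj(lam_j))
    #   Q = A^H Q A + C_eff^H C_eff -> Q_ij = (C_eff^H C_eff)_ij / (1 - conj(lam_i) lam_j)
    P = (B @ B.conj().T) / (1.0 - np.outer(lam, lam.conj()))
    Q = (C_eff.conj().T @ C_eff) / (1.0 - np.outer(lam.conj(), lam))
    ev = np.linalg.eigvals(P @ Q)               # real and non-negative in exact arithmetic
    return np.sort(np.sqrt(np.clip(ev.real, 0.0, None)))[::-1]

def hankel_nuclear_norm(lam, B, C):
    return hankel_singular_values(lam, B, C).sum()

rng = np.random.default_rng(0)
n, m, p = 5, 2, 2
lam = rng.uniform(0.2, 0.7, n) * np.exp(1j * rng.uniform(0, np.pi, n))
B = rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m))
C = rng.normal(size=(p, n)) + 1j * rng.normal(size=(p, n))

print(eig_l1(lam), hankel_nuclear_norm(lam, B, C))
```

A small Hankel singular value marks a state direction that contributes little to the input-output behavior, so penalizing their sum pushes the model toward a form that balanced reduction can shrink with little loss. Similarly, eigenvalues driven toward zero by the $\ell_1$ penalty correspond to modes with a short memory.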

Results

We show on the F-16 ground vibration dataset that, when training is performed with these regularizers, the subsequent MOR step is significantly more effective.

Main files

The main files are:

The training script uses Hydra to handle different configurations. For instance, the model with Hankel nuclear norm regularization is trained with:

python train.py +experiment=larg_reg_hankel

The configuration files defining the experiments are in the conf folder.

Software requirements

Experiments were performed in a Python 3.11 conda environment with:

  • numpy
  • scipy
  • matplotlib
  • python-control
  • pytorch (v2.2.1)

Citing

If you find this project useful, we encourage you to:

  • Star this repository ⭐

  • Cite the paper

@article{forgione2024model,
  title={Model order reduction of deep structured state-space models: A system-theoretic approach},
  author={Forgione, Marco and Mejari, Manas and Piga, Dario},
  journal={arXiv preprint arXiv:2403.14833},
  year={2024}
}