
clementetienam/Ultra-Fast-Mixture-of-Experts-Regression


Ultra-fast Mixture of Experts Regression:

Mixtures of experts have become an indispensable tool for flexible modelling in a supervised learning context, and sparse Gaussian processes (GP) have shown promise as a leading candidate for the experts in such models. In the present article, we propose to design the gating network for selecting the experts from such mixtures of sparse GPs using a deep neural network (DNN). This combination provides a flexible, robust, and efficient model which is able to significantly outperform competing models. We furthermore consider efficient approaches to compute maximum a posteriori (MAP) estimators of these models by iteratively maximizing the distribution of experts given allocations and allocations given experts. We also show that a recently introduced method called Cluster-Classify-Regress (CCR) is capable of providing a good approximation of the optimal solution extremely quickly. This approximation can then be further refined with the iterative algorithm.
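For readers who prefer symbols, the model and the alternating MAP scheme described above can be written roughly as follows. The notation is an assumption made here for illustration (z for the latent allocation, ψ for the gating network parameters, θ_k for the parameters of expert k) and is not taken from the repository; the first line is the standard mixture of experts form, which simplifies the treatment of the GP experts.

```latex
% Mixture of experts with K experts: z is the latent allocation (which expert),
% \psi the gating-network parameters, \theta_k the parameters of expert k.
p(y \mid x) = \sum_{k=1}^{K} p(z = k \mid x, \psi)\, p(y \mid x, \theta_k)

% Alternating MAP updates at iteration t, for data (X, y) with allocations Z:
Z^{(t+1)} = \arg\max_{Z} \; p(Z \mid X, y, \psi^{(t)}, \theta^{(t)})
(\psi^{(t+1)}, \theta^{(t+1)}) = \arg\max_{\psi, \theta} \; p(\psi, \theta \mid X, y, Z^{(t+1)})
```

Under this reading, CCR supplies a fast initial guess for Z, ψ and θ, and the two updates are then repeated to refine it.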

Getting Started:

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites:

MATLAB 2017 or later

Methods:

Three methods are available for the supervised learning problem:

  • CCR: one-pass sparse Gaussian process experts with a deep neural network gating network
  • MM: iterative sparse Gaussian process experts with a deep neural network gating network, using K-means to initialise the latent variables and inducing points
  • MMr: iterative sparse Gaussian process experts with a deep neural network gating network, using random initialisation of the latent variables and K-means to initialise the inducing points
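
To make the one-pass CCR idea concrete, here is a minimal sketch in Python rather than MATLAB, so it is not the code in this repository. It deliberately swaps in much simpler components: plain K-means for the clustering step, a 1-nearest-neighbour lookup in place of the DNN gate, and a straight-line least-squares fit in place of each sparse GP expert. All function names (`kmeans`, `fit_line`, `ccr_fit`, `ccr_predict`) and the toy data are hypothetical.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's k-means on tuples; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid in squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in expert)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def ccr_fit(xs, ys, k):
    # 1. Cluster the joint (x, y) pairs.
    labels = kmeans(list(zip(xs, ys)), k)
    # 2. Classify: this stand-in gate just stores (x, label) pairs for 1-NN lookup.
    gate = list(zip(xs, labels))
    # 3. Regress: one expert per cluster, fitted only to that cluster's points.
    experts = {}
    for j in set(labels):
        cx = [x for x, lab in zip(xs, labels) if lab == j]
        cy = [y for y, lab in zip(ys, labels) if lab == j]
        experts[j] = fit_line(cx, cy)
    return gate, experts

def ccr_predict(x, gate, experts):
    # Route x to the expert of its nearest training input, then evaluate it.
    _, label = min(gate, key=lambda pair: abs(pair[0] - x))
    slope, intercept = experts[label]
    return slope * x + intercept

# Toy discontinuous target: two linear pieces with a jump at x = 0.5.
xs = [i / 40 for i in range(40)]
ys = [1 + 2 * x if x < 0.5 else 10 - 3 * x for x in xs]
gate, experts = ccr_fit(xs, ys, k=2)
print(ccr_predict(0.25, gate, experts), ccr_predict(0.75, gate, experts))
```

The point of the sketch is the ordering of the three steps: each is run exactly once, which is why CCR is described as one-pass, while MM and MMr repeat the allocation and fitting steps.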

Datasets

Running the Numerical Experiment: Run the script TRAINING.m and follow the prompts on the screen, which will guide you through using your own dataset. For prediction, simply run the PREDICTING.m script, adapting it to the test input dataset.

Dependencies

  • CCR, MM and MMr require the GPML toolbox [12] in addition to my own library of utility functions (CKS/CKS_DNN/RFS)
  • The CKS library contains the scripts needed for visualisation, running the neural network, and computing hard and soft predictions

All libraries are included for your convenience.
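
The difference between hard and soft predictions mentioned above can be illustrated with a short Python sketch. This is one common reading of the terms and an assumption here, not a description of the CKS scripts: a hard prediction uses only the expert the gate considers most probable, while a soft prediction averages the experts' outputs weighted by the gate probabilities. The function names and numbers are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to one."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hard_prediction(gate_probs, expert_means):
    """Use only the expert with the highest gate probability."""
    best = max(range(len(gate_probs)), key=lambda k: gate_probs[k])
    return expert_means[best]

def soft_prediction(gate_probs, expert_means):
    """Average the expert outputs, weighted by the gate probabilities."""
    return sum(p * m for p, m in zip(gate_probs, expert_means))

# Hypothetical gate scores and expert predictions for a single test input.
probs = softmax([2.0, 0.0])
means = [1.0, 5.0]
print(hard_prediction(probs, means), soft_prediction(probs, means))
```

Hard predictions keep sharp jumps between experts, whereas soft predictions blend neighbouring experts near a boundary.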

Manuscript

- Clement Etienam, Kody Law, Sara Wade. Ultra-fast Deep Mixtures of Gaussian Process Experts. arXiv preprint arXiv:2006.13309, 2020.

Extras

Extra methods are also included:

  • Running supervised learning models with DNN and MLP alone (requires Netlab and the MATLAB DNN toolbox)
  • Running CCR, MM and MMr with DNNs for both the experts and the gates (requires the MATLAB DNN toolbox)
  • Running the MMr method with sparse GP, DNN or RF experts and DNN or RF gates
  • Running CCR/CCR-MM and MM-MM with random forest experts and random forest gates. This method is fast and also gives a measure of uncertainty

Author:

Dr Clement Etienam, Research Officer (Machine Learning), Active Building Centre

Acknowledgments:

References:

[1] Luca Ambrogioni, Umut Güçlü, Marcel AJ van Gerven, and Eric Maris. The kernel mixture network: A non-parametric method for conditional density estimation of continuous random variables. arXiv preprint arXiv:1705.07111, 2017.

[2] Christopher M Bishop. Mixture density networks. 1994.

[3] Isobel C. Gormley and Sylvia Frühwirth-Schnatter. Mixtures of Experts Models. Chapman and Hall/CRC, 2019.

[4] R.B. Gramacy and H.K. Lee. Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association, 103(483):1119–1130, 2008.

[5] Robert A Jacobs, Michael I Jordan, Steven J Nowlan, Geoffrey E Hinton, et al. Adaptive mixtures of local experts. Neural computation, 3(1):79–87, 1991.

[6] Michael I Jordan and Robert A Jacobs. Hierarchical mixtures of experts and the EM algorithm. Neural computation, 6(2):181–214, 1994.

[7] Trung Nguyen and Edwin Bonilla. Fast allocation of Gaussian process experts. In International Conference on Machine Learning, pages 145–153, 2014.

[8] Carl E Rasmussen and Zoubin Ghahramani. Infinite mixtures of Gaussian process experts. In Advances in neural information processing systems, pages 881–888, 2002.

[9] Tommaso Rigon and Daniele Durante. Tractable Bayesian density regression via logit stickbreaking priors. arXiv preprint arXiv:1701.02969, 2017.

[10] Volker Tresp. Mixtures of Gaussian processes. In Advances in neural information processing systems, pages 654–660, 2001.

[11] Lei Xu, Michael I Jordan, and Geoffrey E Hinton. An alternative model for mixtures of experts.

[12] Carl Edward Rasmussen and Hannes Nickisch. Gaussian processes for machine learning (GPML) toolbox. The Journal of Machine Learning Research, 11:3011–3015, 2010.

[13] David E. Bernholdt, Mark R. Cianciosa, David L. Green, Jin M. Park, Kody J. H. Law, and Clement Etienam. Cluster, classify, regress: A general method for learning discontinuous functions. Foundations of Data Science, 1(4):491, 2019.

[14] Clement Etienam, Kody Law, Sara Wade. Ultra-fast Deep Mixtures of Gaussian Process Experts. arXiv preprint arXiv:2006.13309, 2020.