Skip to content
This repository has been archived by the owner on Oct 6, 2021. It is now read-only.
/ natPSOHO Public archive

Natural Encoding Particle Swarm Optimization Higher-Order Dynamic Bayesian Network Structure Learning in R

License

Notifications You must be signed in to change notification settings

dkesada/natPSOHO

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

natPsoho

This is an extension of the particle swarm optimization structure learning algorithm for higher-order dynamic Bayesian networks from Santos and Maciel (https://doi.org/10.1109/BRC.2014.6880957) to an integer number encoding that supports multiple orders at the same time. The original algorithm can be found in both the ‘dbnR’ package (https://github.com/dkesada/dbnR) and in its own repository (https://github.com/dkesada/PSOHO). I couldn’t fork my own repository with the implementation for the original algorithm, so I imported it into a new one. I will use it as a template and extend it to the integer number encoding.

The main objective of this new encoding is to be able to learn the appropriate Markovian order of the network and the structure at the same time instead of having to fix the order beforehand. In this extension, you can fix a maximum order and the algorithm will search structures up to that point. The new encoding offers integer number vectors whose size only depend on the number of variables in t0, not on the order of the network. It also offers a way to favour smaller orders by fixing a geometric distribution that defines how often higher order arcs are added.

The algorithm is now finished and tested. I represents a big improvement over the original binary version and it scales much better for higher orders. Some features are still work in progress, but for the most part it’s done. Now that the implementation of the algorithm is finished and it has been included in the ‘dbnR’ package, further development in this repository will be halted.

Beware that this version of the algorithm uses an experimental version of the bnlearn package (https://github.com/dkesada/bnlearn_exp) that I modified to bypass security checks and speed the computations considerably, so if for some reason someone wants to use this specific version of natPSOHO, the experimental version of bnlearn has to be installed from GitHub too. The dbnR version of this algorithm uses regular bnlearn, so it is much easier to use.

A conference article that explains the natPSOHO algorithm in detail has also been made:

  • Quesada, D., Bielza, C., & Larrañaga, P. (2021, September). Structure Learning of High-Order Dynamic Bayesian Networks via Particle Swarm Optimization with Order Invariant Encoding. In International Conference on Hybrid Artificial Intelligence Systems (pp. 158-171). Springer, Cham.

It can be found on-line as a book chapter from the proceedings (https://link.springer.com/chapter/10.1007/978-3-030-86271-8_14) and as a preprint on ResearchGate (https://www.researchgate.net/publication/354602610_Structure_Learning_of_High-Order_Dynamic_Bayesian_Networks_via_Particle_Swarm_Optimization_with_Order_Invariant_Encoding)

Some conclusions and remarks

Similarities with the binary case

The new encoding is similar to the binary case in that the integer vectors can be translated into binary ones with some maximum order and it could be argued that both approaches are equivalent. However, the integer representation of positions and velocities offers the advantage that there is really no need to fix any order or any maximum order. The maximum order is useful to perform the search in a promising space rather than exactly that of the given Markovian order. Also, the addition of different probabilities to add arcs depending on the order prevents the network from freely searching in a bigger search space in favor of graphs with more populated lower orders.

The performance and efficiency are also heavily improved in the new algorithm. This is mainly due to three reasons:

  • Encoding the structure of the network in vectors whose length is quadratic on the number of nodes in only t_0 means that no matter the order of the desired network, the size of the particles will remain constant. This helps the algorithm greatly when dealing with higher orders, where as in the original algorithm the causal lists get bigger with each order increase.

  • Performing bitwise operations over an integer is much faster than iterating through lists and operating each value independently.

  • Both of my implementations use ‘Rcpp’ and switch to C++ for operating over the particles One implementation uses named Rcpp::List and the other uses only Rcpp::NumericVector, but when implementing the new algorithm as a fork of the old one I saw a lot of old parts that could be improved to be more efficient. Those inefficiencies are still present in the old algorithm, and so the new one is slightly improved in some sections.

About

Natural Encoding Particle Swarm Optimization Higher-Order Dynamic Bayesian Network Structure Learning in R

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published