Skip to content

KonstantinosNikolakakis/Information-Threshold

Repository files navigation

Information-Threshold

Efficient learning of hidden tree-structures from noisy data. Model: Tree-structured Markov Random Fields.

Last revision: Feb 2021 (Matlab R2019b) Author: Konstantinos Nikolakakis

The purpose of this code is to support the paper: "Optimal Rates for Learning Hidden Tree Structures" by Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Anand D. Sarwate.

Any part of this code used in your work should cite the above publication. This code is provided "as is" to support the ideals of reproducible research. Any issues with this code should be reported by email to k.nikolakakis@rutgers.edu.

The code is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License available at https://creativecommons.org/licenses/by-nc-sa/4.0/.

Description: Structure_Learning.m generates a tree-structured (over p nodes with binary values) and synthetic samples based on that model. Structure estimation is performed on the noiseless and noisy data of the underlying model. Noise is generated by a binary symmetric channel. For each iteration, the original tree structure T is generated randomly where, starting from the root, we choose the parent of each new node uniformly at random among the nodes that are currently in the tree, in a sequential fashion. We estimate the hidden structure by considering the Chow-Liu algorithm with input synthetic data. We find the error-rate by comparing the set of edges of the original and estimated tree. The values of the information threshold are found by the closed form expression of the mutual information, that is a function of correlations and the coss-over probability q. Specifically, for each value q, we change the values of all interaction parameters (including strong and weak edges), for each change we get a new point (new value for the information threshold). Further, the change in q also affects the value of the (noisy) information threshold, thus the points in the X axis in figure noisy.pdf are not identical for each curve. Our goal is to graphical illustrate the decay of error-rate (probability of missing at least one edge of the tree) for different values of the information threshold. The exponential decay of the error-rate when the information treshold increases supports the theoretical results of the paper.

Requires: UndirectedMaximumSpanningTree.m by Guangdi Li (2009). Maximum Weight Spanning tree (Undirected), MATLAB Central File Exchange. Revised by Konstantinos Nikolakakis (2019).

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages