This paper considers the problem of quantifying the quality of model selection for a graphical model. Model selection often uses a distance measure such as the Kullback-Leibler (KL) divergence to quantify how well the model distribution approximates the original distribution. We extend this work by formulating the problem as a detection problem between the original distribution and the model distribution. In particular, we focus on the covariance selection problem introduced by Dempster [1] and consider the case where the distributions are Gaussian. Previous work showed that if the approximation model is a tree, the optimal tree that minimizes the KL divergence can be found using the Chow-Liu algorithm [2]. While the algorithm minimizes the KL divergence, it does not minimize other measures, such as other divergences and the area under the curve (AUC). These measures all depend on the eigenvalues of the correlation approximation matrix (CAM). We find expressions for the KL divergence, the log-likelihood ratio, and the AUC as functions of the CAM. Easily computable upper and lower bounds are also found for the AUC. The paper concludes by computing these measures for real and synthetic simulation data.
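As a rough illustration of how such a measure can depend only on eigenvalues, the sketch below computes the KL divergence between two zero-mean Gaussians from the eigenvalues of the product of the inverse model covariance and the original covariance. The function name `gaussian_kl_from_eigenvalues`, the zero-mean assumption, and the choice of this particular matrix product as a stand-in for the CAM are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def gaussian_kl_from_eigenvalues(sigma, sigma_model):
    """KL divergence D(N(0, sigma) || N(0, sigma_model)) via eigenvalues.

    Hypothetical helper: it uses the standard identity
        D = 0.5 * sum_i (lam_i - 1 - ln lam_i),
    where lam_i are the eigenvalues of inv(sigma_model) @ sigma.
    Both inputs are assumed to be symmetric positive definite.
    """
    sigma = np.asarray(sigma, dtype=float)
    sigma_model = np.asarray(sigma_model, dtype=float)
    # Solve sigma_model @ X = sigma instead of forming an explicit inverse.
    ratio = np.linalg.solve(sigma_model, sigma)
    # The eigenvalues are real and positive for positive definite inputs;
    # .real only discards round-off imaginary parts.
    lam = np.linalg.eigvals(ratio).real
    return 0.5 * float(np.sum(lam - 1.0 - np.log(lam)))


# Toy example: two unit-variance variables with correlation 0.5,
# approximated by a model that treats them as independent.
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
sigma_model = np.eye(2)
kl = gaussian_kl_from_eigenvalues(sigma, sigma_model)
print(kl)
```

In this toy case the eigenvalues are 1.5 and 0.5, so the divergence reduces to -0.5 ln(0.75); the value is zero only when the model covariance equals the original covariance.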
Subject: Computer Science and Mathematics - Computer Science
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.